Abstract: In contrast to the existing approaches to bisimulation
for fuzzy systems, we introduce a behavioral distance to
measure the behavioral similarity of states in a nondeterministic 
fuzzy-transition system. This behavioral distance is defined as 
the greatest fixed point of a suitable monotonic function and 
provides a quantitative analogue of bisimilarity. The behavioral 
distance has the important property that two states are at zero 
distance if and only if they are bisimilar. Moreover, for any given 
threshold, we find that states with behavioral distances bounded 
by the threshold are equivalent. In addition, we show that 
two system combinators — parallel composition and product — 
are non-expansive with respect to our behavioral distance, which 
makes compositional verification possible. 

Index Terms — Behavioral distance, bisimulation, fuzzy au- 
tomaton, fuzzy-transition system, non-expansiveness, pseudo- 
ultrametric. 



I. Introduction 

ONE of the most important contributions of concurrency 
theory to computer science is the concept of bisimulation 
(see, for example, [1]-[3] and the bibliographies therein). It
expresses when two systems can behave in the same way 
in the sense that one system simulates the other and vice- 
versa; intuitively, two systems are bisimilar if they match each 
other's moves. In addition to testing behavioral equivalence, 
bisimulation allows one to reduce the state space of a system 
by combining bisimilar states. 

Recently, bisimulation techniques have been introduced to 
fuzzy automata, or more generally, fuzzy systems. Two general 
approaches can be recognized in the existing literature. One is 
based on a binary relation on the state space of a fuzzy system 
such that related states have exactly the same possibility degree 
of making a transition into every class of related states [4]-[8].
Roughly speaking, this approach is a fuzzy analogue
of the classical probabilistic bisimulation initiated by Larsen
and Skou [9]. More concretely, in [4], Petković introduced
the notion of congruence for fuzzy automata by following 
the algebraic theory of classical automata. In fact, such a 
congruence is nothing other than a bisimulation. Based on the 
concept of congruence, an improved minimization algorithm 
for fuzzy automata has been developed. In [5], Buchholz
has put forward a general definition of bisimulation for
weighted automata over a generic semiring. By instantiating
the semiring to the closed unit interval [0, 1] with binary
operations max and min, Sun et al. addressed the forward and
backward bisimulations for fuzzy automata [6]. More recently,
Cao et al. investigated bisimulation for deterministic and
nondeterministic fuzzy systems, which may be infinite-state or
infinite-event, in [7] and [8], respectively, and applied it to the
specification of nondeterministic fuzzy discrete-event systems.

The other approach, from Ćirić and his colleagues [10]-[15],
is based on a fuzzy relation on the state space. They
introduced two types of simulations (forward and backward) 
and four types of bisimulations (forward, backward, forward- 
backward, and backward-forward) for fuzzy automata, which 
are all defined as fuzzy relations [10], [13]. In particular, they
proved that the greatest (forward) bisimulation on a fuzzy 
automaton is a fuzzy equivalence relation. Furthermore, the 
state reduction of fuzzy automata and some related algorithms 
for computing the greatest simulations and bisimulations have 
been well developed. Remarkably, Ćirić et al. have found
that the state reduction problem for fuzzy automata is closely
related to the problem of solving certain systems of fuzzy relation
equations [10], [12], [14], which provides a new insight
into the theory of fuzzy relational equations and inequalities.

For the first approach, we observe that the bisimulation 
based on a crisp relation, in which states are either bisimilar 
or not, is not a robust concept, since states that used to 
be bisimilar may not be anymore or vice versa if some of 
the possibility degrees change slightly. This is particularly 
unfortunate for the reason that such possibility degrees are 
often obtained experimentally, or are given as approximations. 
Therefore, it may not make sense to say that two states are 
exactly bisimilar. In terms of robustness, the bisimulation 
based on a fuzzy relation is much better than the exact 
bisimulation. This advantage leads to many better results in 
the state reduction of fuzzy automata [12], [14].



Inspired by earlier work on probabilistic concurrent systems
[16]-[21], in the present paper we exploit a pseudo-ultrametric
to measure the similarity of states in a (nondeterministic) 
fuzzy-transition system (FTS). A pseudo-ultrametric is a func- 
tion that yields a nonnegative real number distance for each 
pair of states. Such a pseudo-ultrametric gives rise to a quanti- 
tative analogue of exact bisimilarity in that the behavioral dis- 
tance between states captures the similarity of the behavior of 
those states. The smaller the behavioral distance, the more the 
states behave similarly. In particular, the behavioral distance 
between states is 0 if and only if they are exactly bisimilar.
Moreover, for any given threshold, we can partition the state 
space such that the behavioral distance between the states in 
the same block is bounded by the threshold. In other words, 
states with behavioral distances bounded by the threshold are 
equivalent. Technically, we define the behavioral distance as
the greatest fixed point of a suitable monotonic function, while 
the post-fixed points of the monotonic function provide a 
characterization of exact bisimulation. We also show that two
system combinators (parallel composition and product) are
non-expansive with respect to our behavioral distance. This 
allows us to use the behavioral distance for compositional 
reasoning. 

Some related research should be distinguished before introducing
the organization of the paper. Broadly speaking, our
behavioral distance is a fuzzy analogue of the pseudo-metric
used in some probabilistic systems [16]-[21]. Nevertheless,
building such an analogue is not trivial, and we have to develop
a whole new framework. For example, our lifting of pseudo-ultrametrics
from states to possibility distributions answers an
open problem on the existence of a fuzzy analogue of the Kantorovich
metric raised by Repovš et al. in [22]. Our approach
is rather different from the one by Ćirić et al. [10]-[15] in
at least two aspects: One is that we are using the pseudo- 
ultrametric, which is related in spirit to the Hutchinson metric 
and the Hausdorff distance, to provide a robust and quantitative 
notion of behavioral equivalence, while the second approach is 
dependent on finding a solution to a particular system of fuzzy 
relation equations or inequalities. The main feature of the latter 
is the intensive use of fuzzy relation calculus and systems of 
fuzzy relation equations and inequalities. The other difference 
is that the underlying systems addressed by the second ap- 
proach are deterministic in the sense that the next possibility 
distribution is determined by the current state and event, while 
we are concerned with more general nondeterministic fuzzy 
systems. It should be noted that nondeterminism is essential 
for modeling scheduling freedom, implementation freedom, 
the external environment, and incomplete information (see, for
example, [23]).

The rest of the paper is organized as follows. In Section 
II, we collect a few necessary notations and notions of fuzzy 
sets and FTSs. Section III embarks upon the development of 
behavioral distance. It starts by lifting a pseudo-ultrametric 
from states to possibility distributions. Based on the lifting, 
we then define a function on a set of pseudo-ultrametrics 
and discuss the monotonicity of the function. Thanks to Tarski's
fixed point theorem, we get the greatest fixed point of the 
function and define it as our behavioral distance. In Section 
IV, after establishing a lifting of relation from states to possi- 
bility distributions, we justify the soundness of the behavioral 
distance by disclosing the relationship between the distance 
and bisimilarity. In the subsequent section, we investigate the 
non-expansiveness of the behavioral distance with respect to 
the parallel composition and product operators. The paper is 
concluded in Section VI with a brief discussion of future work. 

II. Fuzzy-Transition Systems 

In this section, we recall some basic notions of fuzzy sets 
and FTSs. 

Let $X$ be a universal set. A fuzzy subset of $X$ (or simply
fuzzy set [24]), $\mu$, is defined by a function assigning to each
element $x$ of $X$ a value in the closed unit interval
$[0, 1]$. Such a function is called a membership function; the
value $\mu(x)$ characterizes the degree of membership of $x$ in
$\mu$. A fuzzy subset of $X$ can be used to formally represent a
possibility distribution on $X$.

The support of a fuzzy set $\mu$ is a crisp set defined as
$\mathrm{supp}(\mu) = \{x \in X : \mu(x) > 0\}$. Whenever $\mathrm{supp}(\mu)$ is finite,
say $\mathrm{supp}(\mu) = \{x_1, x_2, \ldots, x_n\}$, we may write the fuzzy set
$\mu$ in Zadeh's notation as follows:

$$\mu = \frac{\mu(x_1)}{x_1} + \frac{\mu(x_2)}{x_2} + \cdots + \frac{\mu(x_n)}{x_n}.$$

With this notation, $\frac{1}{x}$, denoted by $\hat{x}$, is a singleton in $X$, i.e.,
the fuzzy subset of $X$ with membership 1 at $x$ and with zero
membership for all the other elements of $X$.

We denote by $\mathcal{F}(X)$ the set of all fuzzy subsets of $X$ (i.e.,
possibility distributions on $X$) and by $\mathcal{P}(X)$ the power set of
$X$. For any $\mu, \eta \in \mathcal{F}(X)$, we say that $\mu$ is contained in $\eta$ (or $\eta$
contains $\mu$), denoted by $\mu \subseteq \eta$, if $\mu(x) \le \eta(x)$ for all $x \in X$.
Notice that $\mu = \eta$ if both $\mu \subseteq \eta$ and $\eta \subseteq \mu$. A fuzzy set is
said to be empty if its membership function is identically zero
on $X$. We use $\emptyset$ to denote the empty fuzzy set.

For any family $\lambda_i$, $i \in I$, of elements of $[0, 1]$, we write
$\bigvee_{i \in I} \lambda_i$ or $\bigvee\{\lambda_i : i \in I\}$ for the supremum of $\{\lambda_i : i \in I\}$,
and $\bigwedge_{i \in I} \lambda_i$ or $\bigwedge\{\lambda_i : i \in I\}$ for the infimum. In particular, if
$I$ is finite, then $\bigvee_{i \in I} \lambda_i$ and $\bigwedge_{i \in I} \lambda_i$ are the greatest element
and the least element of $\{\lambda_i : i \in I\}$, respectively.

For any given $c \in [0, 1]$ and $\mu \in \mathcal{F}(X)$, the scale product
$c \cdot \mu$ of $c$ and $\mu$ is defined by

$$(c \cdot \mu)(x) = c \wedge \mu(x)$$

for each $x \in X$; this is again a fuzzy set. Given $\mu, \eta \in \mathcal{F}(X)$,
the union of $\mu$ and $\eta$, denoted $\mu \cup \eta$, is defined by the
membership function

$$(\mu \cup \eta)(x) = \mu(x) \vee \eta(x)$$

for all $x \in X$. For any $\mu \in \mathcal{F}(X)$ and $U \subseteq X$, the notation
$\mu(U)$ stands for $\bigvee_{x \in U} \mu(x)$.
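The operations above are easy to experiment with. The following is a minimal sketch, assuming (for illustration only) that a fuzzy set with finite support is stored as a Python dictionary from elements to membership degrees; the names `scale`, `union`, and `measure` are hypothetical and not part of the paper.

```python
from typing import Dict, Iterable

# Assumption: a fuzzy set is a dict from element to membership in [0, 1];
# elements that are absent have membership 0.
FuzzySet = Dict[str, float]


def scale(c: float, mu: FuzzySet) -> FuzzySet:
    """Scale product: (c . mu)(x) = min(c, mu(x))."""
    return {x: min(c, m) for x, m in mu.items() if min(c, m) > 0}


def union(mu: FuzzySet, eta: FuzzySet) -> FuzzySet:
    """Union: (mu U eta)(x) = max(mu(x), eta(x))."""
    return {x: max(mu.get(x, 0.0), eta.get(x, 0.0)) for x in set(mu) | set(eta)}


def measure(mu: FuzzySet, subset: Iterable[str]) -> float:
    """mu(U): the supremum of mu(x) over x in U (0 when U is empty)."""
    return max((mu.get(x, 0.0) for x in subset), default=0.0)


mu = {"s": 0.9, "t": 0.3}
eta = {"s": 1.0, "t": 0.5}
half_mu = scale(0.5, mu)           # memberships capped at 0.5
joined = union(mu, eta)            # pointwise maximum
height = measure(mu, ["s", "t"])   # mu(S), the largest membership
```

Because max and min only select among the given values, exact floating-point comparisons on the results are safe in this setting.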

We now review the concept of FTSs. In [7], an FTS is
defined as a four-tuple $(S, A, \delta, s_0)$, where $S$ is the set of
states, $A$ is the set of labels, $\delta$ is a mapping from $S \times A$ to
$\mathcal{F}(S)$, and $s_0$ is the initial state. Labels in an FTS can represent
different things. Typical uses of labels include representing
input expected, conditions that must be true to trigger the
transition, or actions performed during the transition. If the
label set is a singleton, the system is essentially unlabeled, and
a simpler definition that omits the labels is possible. Intuitively,
if the FTS is in state $s \in S$ and the label $a \in A$ occurs, then it
may go into the state $s' \in S$ with possibility degree $\delta(s, a)(s')$.
Such an FTS is deterministic in the sense that for each state $s$
and label $a$, only one possibility distribution $\delta(s, a)$ is returned
by $\delta$.

Following [8], [25], in this work we address a more general
FTS by taking into account nondeterminism. As a result, for
each state $s$ and label $a$, more than one possibility distribution
may be returned by $\delta$.

Definition 1: A (nondeterministic) fuzzy-transition system
(FTS) is a three-tuple $(S, A, \delta)$, where
(1) $S$ is the set of states,
(2) $A$ is the set of labels, and
(3) $\delta$, called a fuzzy transition function, is a mapping from
$S \times A$ to $\mathcal{P}(\mathcal{F}(S))$.

If $(S, A, \delta)$ is an FTS such that $s \in S$, $a \in A$, $\mu \in \mathcal{F}(S)$,
and $\mu \in \delta(s, a)$, we write $s \xrightarrow{a} \mu$ and call it a fuzzy transition.
An FTS is said to be finite if both $S$ and $A$ are finite, and
infinite otherwise. For simplicity, we have excluded the initial
state from consideration and will work exclusively with finite
FTSs in the paper.

III. Behavioral Distance 

This section, consisting of three subsections, is devoted to 
quantifying the behavioral similarity between states of an FTS. 
This quantity, which is based upon the fuzzy transitions de- 
rived from the states, meets some desirable metric properties. 

A. Lifting of Pseudo-Ultrametric 

In this subsection, we lift a pseudo-ultrametric from states to 
possibility distributions. Let us first collect some basic notions 
on pseudo-ultrametric space. 

Definition 2: Let $X$ be a nonempty universe. A function
$d : X \times X \to [0, 1]$ is called a pseudo-ultrametric on $X$ if
for all $x, y, z \in X$,

(P1) $d(x, x) = 0$,

(P2) $d(x, y) = d(y, x)$, and

(P3) $d(x, z) \le d(x, y) \vee d(y, z)$.
The couple $(X, d)$ is called a pseudo-ultrametric space. If the
triangle inequality (P3) is weakened to

(P3') $d(x, z) \le d(x, y) + d(y, z)$,
then $d$ is called a pseudo-metric and $(X, d)$ is a pseudo-metric
space. At the same time, if (P1) is strengthened to

(P1') $d(x, y) = 0$ if and only if $x = y$,
that is, $d$ satisfies (P1'), (P2), and (P3'), then $d$ is called a
metric and $(X, d)$ is a metric space.

To simplify notation, we sometimes write $X$ instead of
$(X, d)$. Trivially, the constant function that maps any pair
$(x, y)$ to 0 is a pseudo-ultrametric. In addition, the discrete
metric, where $d(x, y) = 0$ if $x = y$ and $d(x, y) = 1$ otherwise,
is a pseudo-ultrametric.

The notion of pseudo-ultrametric has an intuitive interpretation
[26]: If $G$ is an edge-weighted undirected graph, all
edge weights are in [0,1], and d{x,y) is the weight of the 
minimax path between vertices x and y (that is, the maximum 
weight of an edge, on a path chosen to minimize this maximum 
weight), then the vertices of the graph, with distance measured 
by d, form a pseudo-ultrametric space, and all finite pseudo- 
ultrametric spaces can be represented in this way. 
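The graph interpretation can be checked on a small instance. The sketch below computes minimax path weights with a Floyd-Warshall style relaxation and then tests (P1)-(P3); the example graph, the function names, and the convention that unconnected vertices are at distance 1 are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple


def minimax_distance(n: int, edges: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Minimax path weight between all vertex pairs of an undirected graph.

    Assumption: vertices joined by no path are placed at distance 1.
    """
    d = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    for (i, j), w in edges.items():
        d[i][j] = d[j][i] = min(d[i][j], w)
    # product(...) varies its first component slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    return d


def is_pseudo_ultrametric(d: List[List[float]]) -> bool:
    """Check (P1), (P2), and the strong triangle inequality (P3)."""
    idx = range(len(d))
    return (all(d[i][i] == 0 for i in idx)
            and all(d[i][j] == d[j][i] for i, j in product(idx, repeat=2))
            and all(d[i][k] <= max(d[i][j], d[j][k])
                    for i, j, k in product(idx, repeat=3)))


# Hypothetical graph on vertices 0..3 with weights in [0, 1].
edges = {(0, 1): 0.2, (1, 2): 0.7, (0, 2): 0.9, (2, 3): 0.4}
d = minimax_distance(4, edges)
```

Here the direct edge between vertices 0 and 2 has weight 0.9, but the detour through vertex 1 has maximum edge weight 0.7, so the minimax distance between 0 and 2 is 0.7.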

A special metric satisfying (P3) appeared in several areas
of mathematics in the early 20th century (K. Hensel, 1904; R.
Baire, 1909; F. Hausdorff, 1914). The corresponding metric
space is called ultrametric or non-Archimedean. A brief review
on ultrametrics, including a wide list of references, can
be found in Lemin's paper [27]. Recently, some significant
examples of pseudo-metric spaces have arisen in probabilistic
transition systems [20], [21], Markov chains [16], [17], and
Markov decision processes [18], [19].



For simplicity, we restrict ourselves to $[0, 1]$-valued pseudo-ultrametrics.
Notice that every pseudo-ultrametric $d'$ with
codomain $[0, +\infty)$ can be homeomorphically transformed into
a $[0, 1]$-valued pseudo-ultrametric $d$ by defining

$$d(x, y) = \frac{d'(x, y)}{1 + d'(x, y)}$$

for all $x$ and $y$ in $X$ [28].
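To see why the strong triangle inequality survives this rescaling, assume the transformation reads $d = f \circ d'$ with $f(r) = r/(1+r)$. Since $f$ is strictly increasing on $[0, +\infty)$, it commutes with taking maxima, and a short derivation (a sketch, not taken from [28]) gives (P3) for $d$:

```latex
% f(r) = r / (1 + r) is strictly increasing, hence f(a \vee b) = f(a) \vee f(b).
\begin{align*}
d(x, z) = f\bigl(d'(x, z)\bigr)
  &\le f\bigl(d'(x, y) \vee d'(y, z)\bigr)
     && \text{by (P3) for } d' \text{ and monotonicity of } f \\
  &= f\bigl(d'(x, y)\bigr) \vee f\bigl(d'(y, z)\bigr) \\
  &= d(x, y) \vee d(y, z).
\end{align*}
```

Properties (P1) and (P2) carry over directly because $f(0) = 0$ and $f$ is applied to a symmetric function.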

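From now on, we fix a finite FTS $(S, A, \delta)$. Let $\mathcal{D}$ be the
set of all pseudo-ultrametrics on $S$. Clearly, $\mathcal{D} \neq \emptyset$. Given a
$d \in \mathcal{D}$, we need to lift it to a pseudo-ultrametric on $\mathcal{F}(S)$.
The lifting is based on the following observation.

Lemma 1: For any $\mu, \eta \in \mathcal{F}(S)$, consider the following
system:

$$\begin{aligned}
\bigvee_{t \in S} x_{st} &= \mu(s), && \forall s \in S \\
\bigvee_{s \in S} x_{st} &= \eta(t), && \forall t \in S \\
x_{st} &\ge 0, && \forall s, t \in S
\end{aligned} \qquad (1)$$

Then it has a solution if and only if $\mu(S) = \eta(S)$.

Proof: See Appendix A. ■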

To understand the above lemma and its proof, let us see a
simple example.

Example 1: Letting $S = \{s, t\}$, $\mu = \frac{0.9}{s} + \frac{0.3}{t}$, and
$\eta = \frac{1}{s} + \frac{0.5}{t}$, the system (1) reduces to

$$\begin{aligned}
x_{ss} \vee x_{st} &= 0.9 \\
x_{ts} \vee x_{tt} &= 0.3 \\
x_{ss} \vee x_{ts} &= 1 \\
x_{st} \vee x_{tt} &= 0.5 \\
x_{st} &\ge 0, \quad \forall s, t
\end{aligned}$$

The third equation implies that either $x_{ss} = 1$ or $x_{ts} = 1$. If
$x_{ss} = 1$, it contradicts the first equation; while if $x_{ts} = 1$,
it contradicts the second equation. Therefore, the above
system has no solution. However, if we consider $\theta = \frac{0.9}{s} + \frac{0.5}{t}$
instead of $\eta$ and take $x_{st}$ by following the proof of the lemma, it gives rise
to a solution to the system (1): $x_{ss} = 0.9$, $x_{st} = 0.5$, $x_{ts} = 0.3$,
and $x_{tt} = 0$.
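Example 1 can be verified numerically. The sketch below assumes the distributions read as $\mu = 0.9/s + 0.3/t$, $\eta = 1/s + 0.5/t$, and $\theta = 0.9/s + 0.5/t$. The helper `candidate` uses the choice $x_{st} = \mu(s) \wedge \eta(t)$, which is one convenient solution whenever the two heights agree; it is an illustrative choice and need not coincide with the construction in Appendix A. The grid search is likewise only a sanity check over the values occurring in the example.

```python
from itertools import product

S = ["s", "t"]
mu = {"s": 0.9, "t": 0.3}
eta = {"s": 1.0, "t": 0.5}
theta = {"s": 0.9, "t": 0.5}
KEYS = list(product(S, repeat=2))


def solves(x, left, right):
    """True when x satisfies system (1) for the pair (left, right)."""
    rows = all(max(x[s, t] for t in S) == left[s] for s in S)
    cols = all(max(x[s, t] for s in S) == right[t] for t in S)
    return rows and cols and all(v >= 0 for v in x.values())


def candidate(left, right):
    """Illustrative choice x_st = min(left(s), right(t))."""
    return {(s, t): min(left[s], right[t]) for s, t in KEYS}


def grid_has_solution(left, right):
    """Search all assignments drawn from the values occurring in the example."""
    grid = sorted({0.0} | set(left.values()) | set(right.values()))
    return any(solves(dict(zip(KEYS, vals)), left, right)
               for vals in product(grid, repeat=len(KEYS)))


# Solution quoted in the text for the pair (mu, theta).
quoted = {("s", "s"): 0.9, ("s", "t"): 0.5, ("t", "s"): 0.3, ("t", "t"): 0.0}
```

For $(\mu, \theta)$ both the quoted assignment and `candidate(mu, theta)` satisfy the system, while for $(\mu, \eta)$ the grid search finds nothing, in line with $\mu(S) = 0.9 \neq 1 = \eta(S)$.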

With the aid of the above lemma, we can now state the 
concept of lifting. 

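Definition 3: Let $d \in \mathcal{D}$. For any $\mu, \eta \in \mathcal{F}(S)$, if $\mu(S) \neq \eta(S)$,
we define $\hat{d}(\mu, \eta) = 1$; otherwise, we define $\hat{d}(\mu, \eta)$
as the value of the following mathematical programming
problem:

$$\begin{aligned}
\text{minimize} \quad & \bigvee_{s, t \in S} \bigl(d(s, t) \wedge x_{st}\bigr) \\
\text{subject to} \quad & \bigvee_{t \in S} x_{st} = \mu(s), && \forall s \in S \\
& \bigvee_{s \in S} x_{st} = \eta(t), && \forall t \in S \\
& x_{st} \ge 0, && \forall s, t \in S
\end{aligned} \qquad \text{(MP)}$$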

Let us revisit Example 1. If $d$ is the discrete metric on $S = \{s, t\}$,
then it follows readily from Definition 3 that $\hat{d}(\mu, \mu) =
\hat{d}(\eta, \eta) = \hat{d}(\theta, \theta) = 0$, $\hat{d}(\mu, \eta) = \hat{d}(\eta, \mu) = 1$, $\hat{d}(\eta, \theta) =
\hat{d}(\theta, \eta) = 1$, and $\hat{d}(\mu, \theta) = \hat{d}(\theta, \mu) = 0.5$. In terms of these
values, $\hat{d}$ satisfies all the requirements of a pseudo-ultrametric.
Before exploring the universality of this property, we pause to 
give two remarks. 

Remark 1: Whenever $\mu(S) = \eta(S)$, it follows from Lemma
1 that the system (1) has a solution. It implies that the
mathematical programming problem (MP) in Definition 3
has a feasible solution. In this case, it is not difficult to
observe that there exists an optimal solution to (MP) by
considering all the alternatives of $x_{st}$ from the finite set
$\{\mu(s) : s \in S\} \cup \{\eta(t) : t \in S\} \cup \{0\}$. Hence, Definition
3 is well-defined.
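Remark 1 suggests that (MP) can be solved by inspecting finitely many values. The following sketch is one way to do so and is not an algorithm given in the paper: it tries candidate optimal values $c$ in increasing order and, for each, tests the largest assignment compatible with the objective bound $c$. The function name `lift` and the dictionary encoding of $d$, $\mu$, and $\eta$ are assumptions; the data are those of Example 1 as read above, with $d$ the discrete metric.

```python
from typing import Dict, List, Tuple

Dist = Dict[str, float]
Metric = Dict[Tuple[str, str], float]


def lift(d: Metric, mu: Dist, eta: Dist, states: List[str]) -> float:
    """Value of (MP) for mu and eta, or 1 when mu(S) != eta(S)."""
    m = lambda f, s: f.get(s, 0.0)
    if max(m(mu, s) for s in states) != max(m(eta, s) for s in states):
        return 1.0

    def feasible(c: float) -> bool:
        # Largest x with min(d(s,t), x_st) <= c, x_st <= mu(s), x_st <= eta(t);
        # if any assignment meets the bound c, this largest one does as well.
        def x(s: str, t: str) -> float:
            cap = 1.0 if d[s, t] <= c else c
            return min(cap, m(mu, s), m(eta, t))
        rows = all(max(x(s, t) for t in states) == m(mu, s) for s in states)
        cols = all(max(x(s, t) for s in states) == m(eta, t) for t in states)
        return rows and cols

    # The optimum is 0, a value of d, or a membership degree; the largest
    # candidate is always feasible when the heights agree (Lemma 1).
    candidates = sorted({0.0} | set(d.values()) | set(mu.values()) | set(eta.values()))
    return next(c for c in candidates if feasible(c))


S = ["s", "t"]
discrete = {(p, q): 0.0 if p == q else 1.0 for p in S for q in S}
mu = {"s": 0.9, "t": 0.3}
eta = {"s": 1.0, "t": 0.5}
theta = {"s": 0.9, "t": 0.5}
```

With these inputs, `lift(discrete, mu, theta, S)` evaluates to 0.5 and `lift(discrete, mu, eta, S)` to 1, matching the values listed after Definition 3.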

Remark 2: In fact, Definition 3 is inspired by the duality of the
Kantorovich metric on probability measures. Roughly speaking,
the Kantorovich metric and its duality provide a way of
measuring the distance between two probability distributions.
We refer the reader to [29] for the history of the Kantorovich
metric and to [30] for its applications in probabilistic concurrency,
image retrieval, data mining, and bioinformatics.
Definition 3 is derived from the duality of the Kantorovich metric
by replacing the sum and product operations therein by max
and min, respectively. Most recently, Repovš et al. [22] posed
an open question: Is there a fuzzy analogue of the Kantorovich
metric on the set of probability measures? Our Definition 3
offers a solution to this problem.

The desirability of Definition 3 is justified by the following
fact.

Theorem 1: For each $d \in \mathcal{D}$, $\hat{d}$ is a pseudo-ultrametric on
$\mathcal{F}(S)$.

Proof: See Appendix A. ■

B. A Monotonic Function on a Complete Lattice 

For later need, we start by discussing the lattice structure
on $\mathcal{D}$ and then define a suitable monotonic function on $\mathcal{D}$. To
this end, we need to endow $\mathcal{D}$ with an order.

Definition 4: The order $\preceq$ on $\mathcal{D}$ is defined by

$$d_1 \preceq d_2 \quad \text{if} \quad d_1(s, t) \ge d_2(s, t) \text{ for all } s, t \in S.$$

It is easy to check that $\preceq$ is indeed a partial order on $\mathcal{D}$.
Note that the order is reversed with respect to the numerical values:
larger in $\preceq$ means pointwise smaller distances.

Recall that a partially ordered set $(X, \le)$ is called a complete
lattice if every subset of $X$ has a supremum and an
infimum in $(X, \le)$. We now show that $\mathcal{D}$ endowed with the
order specified in Definition 4 forms a complete lattice.

Lemma 2: $(\mathcal{D}, \preceq)$ is a complete lattice.

Proof: See Appendix A. ■

For any $d \in \mathcal{D}$, we can get by Theorem 1 a pseudo-ultrametric
$\hat{d}$ on possibility distributions. Further, let us extend
$\hat{d}$ to a distance measure on the sets of possibility distributions.

Recall that the well-known Hausdorff distance measures
how far two subsets of a metric space are from each other.
Informally, the Hausdorff distance is the longest distance of
either set to the nearest point in the other set. For our purpose,
we consider the Hausdorff distance for a pseudo-ultrametric
space.

Definition 5: Let $(X, d)$ be a pseudo-ultrametric space. For
any $x \in X$ and $A \subseteq X$, define

$$d(x, A) = \begin{cases} \bigwedge_{a \in A} d(x, a), & \text{if } A \neq \emptyset \\ 1, & \text{otherwise.} \end{cases}$$

Further, given a pair $A, B \subseteq X$, the Hausdorff distance
induced by $d$ is defined as

$$H_d(A, B) = \begin{cases} 0, & \text{if } A = B = \emptyset \\ \bigl[\bigvee_{a \in A} d(a, B)\bigr] \vee \bigl[\bigvee_{b \in B} d(b, A)\bigr], & \text{otherwise.} \end{cases}$$

As expected, Hd has the following property. 



Lemma 3: If $d$ is a pseudo-ultrametric on $X$, then $H_d$ is a
pseudo-ultrametric on $\mathcal{P}(X)$.

Proof: It follows directly from Definition 5 and we thus
omit the proof. ■
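Definition 5 translates almost literally into code. The sketch below is illustrative only: the function names are hypothetical, and the example ultrametric on integers (0 for equal numbers, 0.5 for distinct numbers of the same parity, 1 otherwise) is an assumption chosen because it is easy to check by hand.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def point_to_set(d: Callable[[T, T], float], x: T, A: Iterable[T]) -> float:
    """d(x, A): infimum of d(x, a) over a in A, and 1 when A is empty."""
    return min((d(x, a) for a in A), default=1.0)


def hausdorff(d: Callable[[T, T], float], A: Iterable[T], B: Iterable[T]) -> float:
    """H_d(A, B) as in Definition 5, for finite A and B."""
    A, B = list(A), list(B)
    if not A and not B:
        return 0.0
    return max([point_to_set(d, a, B) for a in A] +
               [point_to_set(d, b, A) for b in B])


def parity_distance(x: int, y: int) -> float:
    """Hypothetical ultrametric on integers used only for this example."""
    if x == y:
        return 0.0
    return 0.5 if x % 2 == y % 2 else 1.0
```

For instance, `hausdorff(parity_distance, [1, 3], [5])` is 0.5, while comparing any nonempty set with the empty set gives 1, the value forced by the convention $d(x, \emptyset) = 1$.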

Observe that by Theorem 1 any $d \in \mathcal{D}$ induces a pseudo-ultrametric
$\hat{d}$ on $\mathcal{F}(S)$; further, it yields a pseudo-ultrametric
$H_{\hat{d}}$ on $\mathcal{P}(\mathcal{F}(S))$ by the Hausdorff distance. Based on $H_{\hat{d}}$, let
us define a function $\Delta$ on $\mathcal{D}$.

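Definition 6: The function $\Delta : \mathcal{D} \to \mathcal{D}$ is defined as
follows: For any $d \in \mathcal{D}$, $\Delta(d)$ is given by

$$\Delta(d)(s, t) = \bigvee_{a \in A} H_{\hat{d}}\bigl(\delta(s, a), \delta(t, a)\bigr)$$

for all $s, t \in S$.

It is not difficult to check that $\Delta(d) \in \mathcal{D}$ by Lemma
3 and thereby, $\Delta$ is well-defined.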

Our next objective is to show that $\Delta$ has a greatest fixed
point with respect to the partial order in Definition 4. Recall
that Tarski's theorem says that each monotonic
function on a complete lattice has a greatest fixed point [31].
Therefore, to show that $\Delta$ has a greatest fixed point, it remains
to verify that $\Delta$ is monotonic with respect to $\preceq$.

Recall that for a partially ordered set $(X, \le)$, a function
$f : X \to X$ is said to be monotonic if for all $x_1, x_2 \in X$,
$x_1 \le x_2$ implies that $f(x_1) \le f(x_2)$.

Lemma 4: The function $\Delta : \mathcal{D} \to \mathcal{D}$ is monotonic with
respect to the partial order $\preceq$.

Proof: See Appendix A. ■

C. Behavioral Distance 

Based on the results obtained in the previous subsections, 
we can now define the behavioral distance and present some 
interesting properties. 

Recall that for any function $f : X \to X$, an element
$x \in X$ is called a fixed point of $f$ if $x = f(x)$. By using the
lemmas established above, we get a fixed point of $\Delta$.

Theorem 2: The function $\Delta : \mathcal{D} \to \mathcal{D}$ has a greatest fixed
point given by

$$\Delta_{\max} = \bigsqcup \{d \in \mathcal{D} : d \preceq \Delta(d)\}.$$

Proof: Both the existence and explicit representation of
the greatest fixed point follow immediately from Lemmas 2
and 4 and Tarski's fixed point theorem. ■
The greatest fixed point $\Delta_{\max}$ is a pseudo-ultrametric on
$S$, which serves as a distance measure on the state set of an
FTS.

Definition 7: Let $(S, A, \delta)$ be an FTS. For any $s, t \in S$,
the behavioral distance between $s$ and $t$, denoted $d_f(s, t)$, is
defined as

$$d_f(s, t) = \Delta_{\max}(s, t).$$

Due to the ultrametricity of $d_f$, we can make a useful
observation.

Corollary 1: For any given $\lambda \in [0, 1]$, define $R_\lambda =
\{(s, t) \in S \times S : d_f(s, t) \le \lambda\}$. Then $R_\lambda$ is an equivalence
relation on $S$.

Proof: It follows directly from the three properties of a
pseudo-ultrametric. ■
The significance of this corollary is that, by setting a threshold
$\lambda$, which depends on the particular application considered, one
can partition the state space such that any two states in the
same block have distance at most $\lambda$. We defer the description of $R_0$
to the next section.
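The threshold partition can be sketched as follows. The distance values below are hypothetical and only need to form a pseudo-ultrametric; the function name `blocks` is likewise an assumption. Because the relation "distance at most $\lambda$" is transitive for an ultrametric, comparing each state with a single representative of a block is enough.

```python
from typing import Dict, List, Tuple


def blocks(states: List[str], d: Dict[Tuple[str, str], float], lam: float) -> List[List[str]]:
    """Partition states into the classes of {(s, t) : d(s, t) <= lam}."""
    result: List[List[str]] = []
    for s in states:
        for block in result:
            # One representative suffices: (P3) makes the relation transitive.
            if d[block[0], s] <= lam:
                block.append(s)
                break
        else:
            result.append([s])
    return result


# Hypothetical pseudo-ultrametric on four states u, v, w, z.
states = ["u", "v", "w", "z"]
pairs = {("u", "v"): 0.9, ("u", "w"): 0.9, ("v", "w"): 0.6,
         ("u", "z"): 1.0, ("v", "z"): 1.0, ("w", "z"): 1.0}
d = {(s, s): 0.0 for s in states}
for (s, t), value in pairs.items():
    d[s, t] = d[t, s] = value
```

Raising the threshold coarsens the partition: with these values, $\lambda = 0.6$ merges only `v` and `w`, $\lambda = 0.9$ also absorbs `u`, and $\lambda = 1$ collapses everything into one block.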

As an immediate consequence of Theorem 2, we get a way
to calculate the behavioral distance for image-finite FTSs.
Here, by image-finiteness we mean that for any $s \in S$ and
$a \in A$, the cardinality of $\delta(s, a)$ is finite.

Corollary 2: Let $(S, A, \delta)$ be an image-finite FTS. Define
$\Delta^0(\top) = \top$ and $\Delta^{n+1}(\top) = \Delta(\Delta^n(\top))$, where $\top$ is given
by $\top(s, t) = 0$ for all $s, t \in S$. Then

$$d_f = \Delta_{\max} = \bigsqcap \{\Delta^n(\top) : n \in \mathbb{N}\}.$$

Proof: See Appendix A. ■
For the sake of illustrating the above notions and results, 
we give a simple example. 

Example 2: Consider the FTS shown in Fig. 1, where the
states are in circles and each fuzzy transition is depicted via
two parts: an arrow for nondeterministic choice (for simplicity,
in this example we associate to each state at most one fuzzy
transition) and a bunch of arrows for the possibility degrees of
entering next states. Formally, $S = \{s_1, s_2, s_3, s_4\}$, $A = \{a\}$,
and $\delta$ is defined as follows:

$$\delta(s_1, a) = \{\mu\}, \quad \delta(s_2, a) = \{\eta\}, \quad \delta(s_3, a) = \{\theta\}, \quad \delta(s_4, a) = \emptyset,$$

where

$$\mu = \frac{0.9}{s_3} + \frac{0.8}{s_4}, \quad \eta = \frac{0.6}{s_3} + \frac{0.9}{s_4}, \quad \text{and} \quad \theta = \frac{0.9}{s_4}.$$

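Using Corollary 2, we compute $d_f(s_i, s_j)$ by iteration of
$\Delta$ starting from the greatest element $\top$. Let us write $d_n$ for
$\Delta^n(\top)$. By the properties (P1) and (P2) of a pseudo-ultrametric,
we only need to calculate $d_n(s_i, s_j)$ with $i < j$. Moreover,
it follows by the monotonicity of $\Delta$ that $d_{n+1}(s_i, s_j) = 1$
if $d_n(s_i, s_j) = 1$. These observations greatly simplify the
computation below. By Definitions 3 and 5, we have the
following:

$$\begin{aligned}
d_1(s_1, s_2) &= \Delta(\top)(s_1, s_2) = H_{\hat{\top}}\bigl(\delta(s_1, a), \delta(s_2, a)\bigr) \\
&= H_{\hat{\top}}(\{\mu\}, \{\eta\}) = \hat{\top}(\mu, \eta) = 0, \\
d_1(s_1, s_3) &= \hat{\top}(\mu, \theta) = 0, \\
d_1(s_1, s_4) &= H_{\hat{\top}}(\{\mu\}, \emptyset) = 1, \\
d_1(s_2, s_3) &= \hat{\top}(\eta, \theta) = 0, \\
d_1(s_2, s_4) &= H_{\hat{\top}}(\{\eta\}, \emptyset) = 1, \\
d_1(s_3, s_4) &= H_{\hat{\top}}(\{\theta\}, \emptyset) = 1.
\end{aligned}$$

For clarity, we may represent $d_1$ by the matrix $(d_1(s_i, s_j))$,
i.e.,

$$d_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}.$$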
Fig. 1. An FTS.

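Let us go ahead.

$$\begin{aligned}
d_2(s_1, s_2) &= \Delta(d_1)(s_1, s_2) = H_{\hat{d}_1}\bigl(\delta(s_1, a), \delta(s_2, a)\bigr) \\
&= H_{\hat{d}_1}(\{\mu\}, \{\eta\}) = \hat{d}_1(\mu, \eta) = 0.9, \\
d_2(s_1, s_3) &= \hat{d}_1(\mu, \theta) = 0.9, \\
d_2(s_2, s_3) &= \hat{d}_1(\eta, \theta) = 0.6.
\end{aligned}$$

Again, we represent $d_2$ by the matrix

$$d_2 = \begin{pmatrix} 0 & 0.9 & 0.9 & 1 \\ 0.9 & 0 & 0.6 & 1 \\ 0.9 & 0.6 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}.$$

We need to proceed with the iteration. Fortunately, the next step
shows us that $d_3 = d_2$ and thus the iteration can be stopped
at the third step. As a result, we obtain that $d_f = d_2$, that is,

$$\begin{aligned}
d_f(s_i, s_i) &= 0, \quad \text{for all } i = 1, 2, 3, 4, \\
d_f(s_2, s_3) &= 0.6, \\
d_f(s_1, s_2) &= d_f(s_1, s_3) = 0.9, \\
d_f(s_1, s_4) &= d_f(s_2, s_4) = d_f(s_3, s_4) = 1,
\end{aligned}$$

where the symmetric parts are elided.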
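The iteration of Corollary 2 can be reproduced mechanically for this example. The sketch below assumes the distributions of Fig. 1 read as $\mu = 0.9/s_3 + 0.8/s_4$, $\eta = 0.6/s_3 + 0.9/s_4$, and $\theta = 0.9/s_4$, with no transition from $s_4$. The threshold search inside `lift` is an illustrative way of evaluating (MP) and is not an algorithm given in the paper; all function names are hypothetical.

```python
from itertools import product

STATES = ["s1", "s2", "s3", "s4"]
LABELS = ["a"]
MU = {"s3": 0.9, "s4": 0.8}
ETA = {"s3": 0.6, "s4": 0.9}
THETA = {"s4": 0.9}
DELTA = {("s1", "a"): [MU], ("s2", "a"): [ETA],
         ("s3", "a"): [THETA], ("s4", "a"): []}


def lift(d, mu, eta):
    """Lifted distance of Definition 3 (value of (MP), or 1 if heights differ)."""
    m = lambda f, s: f.get(s, 0.0)
    if max(m(mu, s) for s in STATES) != max(m(eta, s) for s in STATES):
        return 1.0
    candidates = sorted({0.0} | set(d.values()) | set(mu.values()) | set(eta.values()))
    for c in candidates:
        def x(s, t):
            # Largest assignment keeping min(d(s,t), x_st) <= c.
            cap = 1.0 if d[s, t] <= c else c
            return min(cap, m(mu, s), m(eta, t))
        rows = all(max(x(s, t) for t in STATES) == m(mu, s) for s in STATES)
        cols = all(max(x(s, t) for s in STATES) == m(eta, t) for t in STATES)
        if rows and cols:
            return c
    return 1.0  # not reached when the heights agree (Lemma 1)


def hausdorff(d, A, B):
    """Hausdorff distance of Definition 5 induced by the lifted distance."""
    if not A and not B:
        return 0.0
    to_set = lambda p, C: min((lift(d, p, q) for q in C), default=1.0)
    return max([to_set(p, B) for p in A] + [to_set(q, A) for q in B])


def step(d):
    """One application of the function of Definition 6."""
    return {(s, t): max(hausdorff(d, DELTA[s, a], DELTA[t, a]) for a in LABELS)
            for s, t in product(STATES, repeat=2)}


def behavioral_distance():
    """Iterate from the top element (all zeros) until the value stabilizes."""
    d = {(s, t): 0.0 for s, t in product(STATES, repeat=2)}
    while True:
        nxt = step(d)
        if nxt == d:
            return d
        d = nxt


df = behavioral_distance()
```

For a finite FTS only finitely many values can occur, so the loop stops; the first stable iterate is a fixed point that dominates every other fixed point in the order of Definition 4, hence it is the greatest one.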

We end this section by relating the behavioral distance
with the similarity relation (also known as fuzzy equivalence
relation) proposed by Zadeh in [32]. Recall that a similarity
relation on $X$ is a binary fuzzy relation $S$ on $X$ (i.e., a
function from $X \times X$ to $[0, 1]$) that satisfies $S(x, x) = 1$,
$S(x, y) = S(y, x)$, and $S(x, y) \wedge S(y, z) \le S(x, z)$, for any
$x, y, z \in X$. We draw the reader's attention to the fundamental
difference between Zadeh's similarity relation and the notion
of bisimilarity in the subsequent section.

The following observation implies that the smaller the value of
$d_f$, the more similar the two states.

Corollary 3: For any $s, t \in S$, let $S(s, t) = 1 - d_f(s, t)$.
Then $S$ is a similarity relation on the state set.

Proof: It follows immediately from the fact that $d_f$ is a
pseudo-ultrametric on the state set. ■
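For completeness, the min-transitivity of $S = 1 - d_f$ can be spelled out; the following short derivation is a sketch that uses only (P3) and the fact that $r \mapsto 1 - r$ turns maxima into minima. Reflexivity and symmetry follow in the same way from (P1) and (P2).

```latex
\begin{align*}
S(x, y) \wedge S(y, z)
  &= \bigl(1 - d_f(x, y)\bigr) \wedge \bigl(1 - d_f(y, z)\bigr) \\
  &= 1 - \bigl(d_f(x, y) \vee d_f(y, z)\bigr) \\
  &\le 1 - d_f(x, z) && \text{by (P3)} \\
  &= S(x, z).
\end{align*}
```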

IV. Relationship with Bisimilarity 

In this section, we embark upon the relationship between 
the behavioral distance and bisimilarity. More concretely, 
we will show that two states are bisimilar if and only if 
they have behavioral distance 0. In addition, we present a 
characterization of bisimulation by exploiting the function $\Delta$.
To this end, we need to lift a relation on states to a relation 
on possibility distributions. 
A. Lifting of Relation

The following notion of lifting is adopted from [33], where
a similar notion was first defined for probability distributions.

Definition 8: Given an $R \subseteq S \times S$, the lifting $\hat{R}$ of $R$ is
defined as the smallest binary relation on $\mathcal{F}(S)$ that satisfies

(1) $(s, t) \in R$ implies $(\hat{s}, \hat{t}) \in \hat{R}$;

(2) $(\mu_i, \eta_i) \in \hat{R}$ implies $\bigl(\bigcup_{i \in I} c_i \cdot \mu_i, \bigcup_{i \in I} c_i \cdot \eta_i\bigr) \in \hat{R}$, for any
finite index set $I$ and $c_i \in [0, 1]$.
Note that $\hat{s}$ in the above definition stands for the possibility
distribution $\frac{1}{s}$.

For later need, it is convenient to have two alternative 
presentations of the lifting. 

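Lemma 5: Let $\mu, \eta \in \mathcal{F}(S)$ and $R \subseteq S \times S$. Then the
following are equivalent.

(1) $(\mu, \eta) \in \hat{R}$.

(2) There exist $(s_i, t_i) \in R$ and $c_i \in [0, 1]$, $i \in I$, such that
$\mu = \bigcup_{i \in I} c_i \cdot \hat{s}_i$ and $\eta = \bigcup_{i \in I} c_i \cdot \hat{t}_i$.

(3) There is a weight function $w : S \times S \to [0, 1]$ such
that

(a) $\bigvee_{t \in S} w(s, t) = \mu(s)$ for any $s \in S$;

(b) $\bigvee_{s \in S} w(s, t) = \eta(t)$ for any $t \in S$;

(c) $w(s, t) > 0$ implies $(s, t) \in R$.

Proof: See Appendix A. ■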
For the special case of an equivalence relation, there is a 
simpler way to describe the lifting. 

Lemma 6: Suppose that $\mu, \eta \in \mathcal{F}(S)$ and $R$ is an equivalence
relation on $S$. Then $(\mu, \eta) \in \hat{R}$ if and only if
$\mu(C) = \eta(C)$ for all $C \in S/R$.

Proof: See Appendix A. ■
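Lemma 6 gives a direct test for the lifted relation when $R$ is an equivalence relation: compare the suprema of the two distributions on every equivalence class. A minimal sketch follows, assuming the quotient $S/R$ is supplied as a list of classes; the function name and the data are hypothetical.

```python
from typing import Dict, Iterable, List

Dist = Dict[str, float]


def class_sup(f: Dist, block: Iterable[str]) -> float:
    """f(C): supremum of f over the class C (0 for states outside the support)."""
    return max((f.get(s, 0.0) for s in block), default=0.0)


def lifted_related(mu: Dist, eta: Dist, classes: List[List[str]]) -> bool:
    """Test of Lemma 6: mu and eta agree on every equivalence class."""
    return all(class_sup(mu, C) == class_sup(eta, C) for C in classes)


# Hypothetical quotient with classes {p, q} and {r}.
classes = [["p", "q"], ["r"]]
mu = {"p": 0.7, "r": 0.4}
eta = {"q": 0.7, "r": 0.4}
eta_low = {"q": 0.5, "r": 0.4}
```

Here `mu` and `eta` place the same degree 0.7 on the class containing `p` and `q`, although on different states, so they are related by the lifting; `eta_low` assigns only 0.5 to that class and is not related to `mu`.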
As we have seen, Definition 3 lifts a pseudo-ultrametric $d$
on $S$ to a pseudo-ultrametric $\hat{d}$ on $\mathcal{F}(S)$, while Definition 8
lifts a binary relation $R$ on $S$ to a binary relation $\hat{R}$ on $\mathcal{F}(S)$.
Somewhat surprisingly, there is an intrinsic connection between
them.

Lemma 7: Suppose that $R$ is a binary relation and $d$ is a
pseudo-ultrametric on $S$ satisfying that for any $s, t \in S$,

$$(s, t) \in R \text{ if and only if } d(s, t) = 0. \qquad (2)$$

Then it holds that for any $\mu, \eta \in \mathcal{F}(S)$,

$$(\mu, \eta) \in \hat{R} \text{ if and only if } \hat{d}(\mu, \eta) = 0. \qquad (3)$$

Proof: See Appendix A. ■
In order to characterize bisimulation via the function $\Delta$, we
need to define an auxiliary function. For any relation $R \subseteq
S \times S$, we associate to it a function $d_R : S \times S \to [0, 1]$
defined by

$$d_R(s, t) = \begin{cases} 0, & \text{if } (s, t) \in R \\ 1, & \text{otherwise.} \end{cases}$$

Let us make a useful observation on which sort of relation
makes $d_R$ into a pseudo-ultrametric.

Lemma 8: Let $R \subseteq S \times S$. Then $R$ is an equivalence
relation if and only if $d_R$ is a pseudo-ultrametric.

Proof: It is straightforward. We thus omit the details. ■



B. Relationship 

Let us begin with the notion of bisimulation from [8], which
is the nondeterministic version of the bisimulation in [7].

Definition 9: Let $(S, A, \delta)$ be an FTS. An equivalence relation
$R \subseteq S \times S$ is called a bisimulation on $S$ if for any
$(s, t) \in R$ and $a \in A$, $s \xrightarrow{a} \mu$ implies $t \xrightarrow{a} \eta$ for some $\eta$
such that $\mu(C) = \eta(C)$ for every $C \in S/R$.

The greatest bisimulation on $S$, denoted by $\sim$, is called
bisimilarity. In other words, $s$ and $t$ are bisimilar, written $s \sim
t$, if $(s, t) \in R$ for some bisimulation $R$.

It is easy to see that in Fig. 1, $s_i \not\sim s_j$ if $i \neq j$.
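Definition 9 can be checked directly for a given partition of a finite FTS. The sketch below assumes the FTS of Fig. 1 with distributions read as $\mu = 0.9/s_3 + 0.8/s_4$, $\eta = 0.6/s_3 + 0.9/s_4$, and $\theta = 0.9/s_4$; the function names are hypothetical, and an equivalence relation is passed as its list of classes.

```python
from typing import Dict, List, Tuple

Dist = Dict[str, float]
Delta = Dict[Tuple[str, str], List[Dist]]


def class_sup(f: Dist, block: List[str]) -> float:
    return max((f.get(s, 0.0) for s in block), default=0.0)


def is_bisimulation(classes: List[List[str]], delta: Delta, labels: List[str]) -> bool:
    """Check Definition 9 for the equivalence relation with the given classes."""
    def matches(mu: Dist, eta: Dist) -> bool:
        return all(class_sup(mu, C) == class_sup(eta, C) for C in classes)

    for block in classes:
        # Ordered pairs (s, t) cover both directions, since R is symmetric.
        for s in block:
            for t in block:
                for a in labels:
                    for mu in delta.get((s, a), []):
                        if not any(matches(mu, eta) for eta in delta.get((t, a), [])):
                            return False
    return True


MU = {"s3": 0.9, "s4": 0.8}
ETA = {"s3": 0.6, "s4": 0.9}
THETA = {"s4": 0.9}
DELTA = {("s1", "a"): [MU], ("s2", "a"): [ETA],
         ("s3", "a"): [THETA], ("s4", "a"): []}

identity = [["s1"], ["s2"], ["s3"], ["s4"]]
merge_12 = [["s1", "s2"], ["s3"], ["s4"]]
merge_23 = [["s1"], ["s2", "s3"], ["s4"]]
```

The identity partition is always a bisimulation. Merging $s_1$ with $s_2$ fails because $\mu$ and $\eta$ give the class of $s_3$ the degrees 0.9 and 0.6; merging $s_2$ with $s_3$ fails for a similar reason.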

We will characterize bisimulation by the post-fixed
points of $\Delta$. Recall that for a partially ordered set $(X, \le)$
and a function $f : X \to X$, an element $x$ of $X$ is called a
post-fixed point of $f$ if $x \le f(x)$.

For the proofs of subsequent theorems, let us present an
explicit characterization of the post-fixed points of $\Delta$.

Lemma 9: For any $d \in \mathcal{D}$, $d$ is a post-fixed point of $\Delta$ if
and only if for any $(s, t) \in S \times S$ and $a \in A$, $s \xrightarrow{a} \mu$ implies
that there exists some $\eta$ (possibly $\emptyset$) such that $t \xrightarrow{a} \eta$ and
$\hat{d}(\mu, \eta) \le d(s, t)$.

Proof: Note that $d$ is a post-fixed point of $\Delta$, that is, $d \preceq \Delta(d)$, if and only
if $\bigvee_{a \in A} H_{\hat{d}}\bigl(\delta(s, a), \delta(t, a)\bigr) \le d(s, t)$ for all $(s, t) \in S \times S$.
The remainder of the proof follows readily from the definition
of Hausdorff distance. ■

We can now state the first main theorem in this section. 

Theorem 3: Let $R$ be an equivalence relation on $S$. Then
$R$ is a bisimulation if and only if $d_R$ is a post-fixed point of
$\Delta$.

Proof: See Appendix A. ■ 
The next theorem characterizing bisimilarity justifies the 
soundness of our behavioral distance. 

Theorem 4: For any $s, t \in S$, $s \sim t$ if and only if $d_f(s, t) = 0$.

Proof: See Appendix A. ■
Following the notation used in Corollary 1, the above theorem
means exactly that $R_0 = {\sim}$. As an immediate consequence
of the theorem, we have the following.

Corollary 4: For any $s, s', t, t' \in S$, if $s \sim s'$ and $t \sim t'$,
then $d_f(s, t) = d_f(s', t')$.

Proof: See Appendix A. ■
It is well known that bisimulation equivalence allows one 
to reduce the state space of a system by combining bisimilar 
states to generate a quotient system with an equivalent be- 
havior but with fewer states. Therefore, the significance of 
Corollary 4 lies in that computing the behavioral distance
between s and t is reduced to computing the behavioral 
distance between their quotients. 

V. Non-Expansiveness

To construct an overall fuzzy system, one may usually 
build its component systems first and then compose them 
by some operators. Therefore, compositional operators can 
serve the need of modular specification and verification of 
systems. A desirable property of compositional operators is 
non-expansiveness. It means that if the difference (with respect
to some behavioral measure) between $s_i$ and $s_i'$ is $\epsilon_i$, then the
difference between $f(s_1, \ldots, s_n)$ and $f(s_1', \ldots, s_n')$ is no more
than $\bigvee_{i=1}^{n} \epsilon_i$, where $f$ is an operator with $n$ arguments. As an
example, we examine the non-expansiveness of our behavioral 
distance with respect to the parallel composition and product 
operators in this section. These operators model two forms of 
joint behavior of some fuzzy systems and we can think of them 
as two types of systems resulting from the interconnection of 
system components. 

To introduce the operators, one further bit of notation will
be handy: Given $\mu_i \in \mathcal{F}(S_i)$, $i = 1,2$, we use $\mu_1 \wedge \mu_2$ to
denote the possibility distribution on $S_1 \times S_2$ that is defined
by

$$(\mu_1 \wedge \mu_2)(s_1,s_2) = \mu_1(s_1) \wedge \mu_2(s_2)$$

for all $(s_1,s_2) \in S_1 \times S_2$.
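The meet of two possibility distributions can be illustrated with a small Python sketch. It assumes that a distribution is stored as a dictionary from states to degrees in [0, 1] and that $\wedge$ is the minimum; the function name `meet` and the sample values are hypothetical and chosen only for illustration.

```python
from itertools import product

def meet(mu1, mu2):
    """Return mu1 ∧ mu2 as a dict over pairs (s1, s2), using min for ∧."""
    return {(s1, s2): min(p1, p2)
            for (s1, p1), (s2, p2) in product(mu1.items(), mu2.items())}

# Hypothetical distributions, not taken from the paper.
mu1 = {"s1": 0.9, "s2": 0.4}
mu2 = {"t1": 0.6, "t2": 1.0}
joint = meet(mu1, mu2)
```

Under these assumptions the largest degree of `joint` equals the smaller of the two largest degrees of `mu1` and `mu2`.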

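The parallel composition operator is asynchronous in the
sense that the components can either synchronize or act
independently. Given an FTS $(S,A,\delta)$, for any $s_1,s_2 \in S$,
the events that are intended to synchronize at $s_1$ and $s_2$ are
listed in the set $A_{s_1} \cap A_{s_2}$ and the rest of the events can
be performed independently, where $A_{s_i}$ stands for $\{a \in A :
\exists \mu \in \mathcal{F}(S),\ s_i \xrightarrow{a} \mu\}$. Formally, the parallel composition is a
three-tuple $(S \times S, A, \delta')$, where for all $s_1,s_2 \in S$ and $a \in A$,

$$\delta'(s_1|s_2,a) = \begin{cases}
\{\mu \wedge \hat{s}_2 : \mu \in \delta(s_1,a)\}, & \text{if } a \in A_{s_1} \setminus A_{s_2} \\
\{\hat{s}_1 \wedge \eta : \eta \in \delta(s_2,a)\}, & \text{if } a \in A_{s_2} \setminus A_{s_1} \\
\{\mu \wedge \eta : \mu \in \delta(s_1,a),\ \eta \in \delta(s_2,a)\}, & \text{if } a \in A_{s_1} \cap A_{s_2} \\
\emptyset, & \text{otherwise.}
\end{cases}$$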

Clearly, this constructs an FTS, which represents that two
states of $(S,A,\delta)$ are running concurrently. Instead of a pair
$(s_1,s_2) \in S \times S$ we write $s_1|s_2$ for a state in the composed
FTS. The synchronization constraint $A_{s_1} \cap A_{s_2}$ forces some
events to be carried out at both of the states at the same time
and allows all the possible interleavings of the other events at
the two states.
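The case analysis of the parallel composition can be sketched in Python as follows. This is only an illustration under stated assumptions: the transition function is a dictionary from (state, action) pairs to lists of distributions, the idle component is represented by the singleton distribution that gives degree 1 to its current state, and the names `meet`, `enabled`, `parallel` and the sample FTS are hypothetical.

```python
def meet(mu, eta):
    """Pointwise minimum of two distributions, indexed by state pairs."""
    return {(s, t): min(p, q) for s, p in mu.items() for t, q in eta.items()}

def enabled(delta, s):
    """Actions with at least one transition from s (the set A_s)."""
    return {a for (state, a), dists in delta.items() if state == s and dists}

def parallel(delta, s1, s2, a):
    """Distributions reachable from s1|s2 on action a."""
    in1, in2 = a in enabled(delta, s1), a in enabled(delta, s2)
    if in1 and in2:   # both components must move together
        return [meet(mu, eta) for mu in delta[(s1, a)] for eta in delta[(s2, a)]]
    if in1:           # only the first component moves, the second stays put
        return [meet(mu, {s2: 1.0}) for mu in delta[(s1, a)]]
    if in2:           # only the second component moves, the first stays put
        return [meet({s1: 1.0}, eta) for eta in delta[(s2, a)]]
    return []         # action not enabled at either state

# Hypothetical FTS, not the one from Fig. 1.
delta = {
    ("p", "a"): [{"x": 0.7, "z": 1.0}],
    ("q", "a"): [{"z": 1.0}],
    ("x", "b"): [{"z": 1.0}],
}
sync = parallel(delta, "p", "q", "a")
solo = parallel(delta, "p", "x", "b")
```

Here `sync` exercises the synchronized case and `solo` the case where only the second component can perform the action.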

The definition of product is somewhat simpler than that of
the parallel composition. Given an FTS $(S,A,\delta)$, the product
is a three-tuple $(S \times S, A, \delta'')$, where $\delta'' : (S \times S) \times A \to
\mathcal{P}(\mathcal{F}(S \times S))$ is defined by

$$\delta''(s_1\|s_2,a) = \begin{cases}
\{\mu \wedge \eta : \mu \in \delta(s_1,a),\ \eta \in \delta(s_2,a)\}, & \text{if } a \in A_{s_1} \cap A_{s_2} \\
\emptyset, & \text{otherwise.}
\end{cases}$$

It turns out that $(S \times S, A, \delta'')$ is again an FTS. The product
requires that the components are strictly synchronous.
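A corresponding sketch for the product is shorter, because only the synchronized case remains. As before, the dictionary encoding of the transition function, the function names, and the sample FTS are assumptions made for illustration only.

```python
def meet(mu, eta):
    """Pointwise minimum of two distributions, indexed by state pairs."""
    return {(s, t): min(p, q) for s, p in mu.items() for t, q in eta.items()}

def product_step(delta, s1, s2, a):
    """Distributions reachable from s1||s2 on action a.

    If either component has no a-transition, one of the lists is empty and
    the comprehension yields [], which matches the 'otherwise' case.
    """
    return [meet(mu, eta)
            for mu in delta.get((s1, a), [])
            for eta in delta.get((s2, a), [])]

# Hypothetical FTS, not the one from Fig. 1.
delta = {
    ("p", "a"): [{"x": 0.7, "z": 1.0}],
    ("q", "a"): [{"z": 1.0}],
    ("x", "b"): [{"z": 1.0}],
}
both = product_step(delta, "p", "q", "a")
blocked = product_step(delta, "p", "x", "a")
```

In contrast to the parallel composition, `blocked` is empty because `x` cannot perform `a`.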

Let us present an example of the parallel composition and 
compute the behavioral distance between the states. 

Example 3: We revisit the FTS shown in Fig. 1. By the
definition of parallel composition, we can readily get all the
fuzzy transitions derived from $s_1|s_2$ and $s_2|s_3$, which are
depicted in Fig. 2.

Fig. 2. The partial transition graph of parallel composition derived from
$s_1|s_2$ and $s_2|s_3$.

For instance,

, N 0.9\ 0.91 

S'{s2\s3,a) = — + — A — ^ 

L V ^3 54/ S4 J 

[sslsi S4IS4 J 

, N fo.9 n 

S'{s3\s4,a) = < — A — } 

I Si Si) 

( 0.9 1 



By a routine computation like Example 2, we can obtain the
values of behavioral distance $d'_f$ on these states; for example,

$$\begin{aligned}
&d'_f(s_4|s_3, s_3|s_4) = d'_f(s_4|s_3, s_3|s_3) = d'_f(s_3|s_4, s_3|s_3), \\
&d'_f(s_2|s_3, s_4|s_3) = d'_f(s_2|s_3, s_3|s_4) = d'_f(s_2|s_3, s_3|s_3) = 0.6, \\
&d'_f(s_1|s_2, s_2|s_3) = d'_f(s_1|s_2, s_4|s_3) = d'_f(s_1|s_2, s_3|s_4) = d'_f(s_1|s_2, s_3|s_3) = 0.9, \\
&d'_f(s_1|s_2, s_4|s_4) = d'_f(s_2|s_3, s_4|s_4) = d'_f(s_4|s_3, s_4|s_4) = d'_f(s_3|s_4, s_4|s_4) = d'_f(s_3|s_3, s_4|s_4) = 1.
\end{aligned}$$

Comparing with the behavioral distances obtained in Example 2,
these values illustrate that the parallel composition is
non-expansive.

More generally, we have the following theorem. 
Theorem 5: Let $(S,A,\delta)$ be an FTS and $s_i,t_i \in S$ with
$d_f(s_i,t_i) = \epsilon_i$, where $i = 1,2$. Then we have the following:

(1) $d'_f(s_1|s_2, t_1|t_2) \le \epsilon_1 \vee \epsilon_2$.

(2) $d''_f(s_1\|s_2, t_1\|t_2) \le \epsilon_1 \vee \epsilon_2$.

Proof: See Appendix A. ■ 
The above theorem gives rise to an interesting corollary, 
which says that both the parallel composition and product 
preserve bisimilarity. 

Corollary 5: If $s_1 \sim t_1$ and $s_2 \sim t_2$, then $s_1|s_2 \sim t_1|t_2$
and $s_1\|s_2 \sim t_1\|t_2$.

Proof: It is straightforward by Theorems 4 and 5. ■

VI. Conclusion and Future Work 

In this paper, we have constructed a pseudo-ultrametric 
for measuring the behavioral distance between states in an 
FTS. The behavioral distance is a quantitative analogue of 
bisimilarity. The smaller the distance, the more alike the states
are. In particular, states are bisimilar if and only if they have 
a distance of 0. We have also shown that for the parallel 
composition and product operators, the behavioral distance is 
non-expansive. Contrary to the exact notion of bisimulation, 
the behavioral distance is much more natural and robust for 
fuzzy systems, which enables us to talk about the approximate 
equivalence of fuzzy systems. 

There are several problems that are worth further study. 
First, developing an algorithm to calculate our behavioral 
distance is desirable. The algorithms for probabilistic systems 
cannot be adapted in a straightforward way to our setting. 
Second, another aspect that should be considered in the future 
is to modify the logic and its interpretation in [34], [35] so
that we can obtain a logical characterization of the behavioral 
distance. This is just a different formal presentation of the 
same thing, but it may help in connecting to the fairly large 
number of works on fuzzy logic. Third, it is interesting 
to discover the relationship between the behavioral distance 
and the greatest bisimulation based on a fuzzy relation in 
[12], [13]. It would be very valuable if one could transfer
results from one setting to the other. Fourth, noting that
there is another approach to measuring the behavioral distance 
between states in a probabilistic system by introducing various 
approximate bisimulations (e.g., p6| , p6|-p9|), one may 
pursue the approximate bisimulation approach in a fuzzy 
system as well. Finally, the present work has focused on 
the theoretical presentation of our behavioral distance. The 
application of this measure in analyzing some practical fuzzy 
systems such as fuzzy control systems, discrete-time fuzzy 
systems [40], and formal models of computing with words
[41]-[43] is left for future work.

Appendix A 

Proof of Lemma 1: Suppose that the system (1) has a
solution $\{x_{st} : s,t \in S\}$. By contradiction, let us assume
that $\mu(S) \neq \eta(S)$, say $\mu(S) > \eta(S)$. Then there exists
$s' \in S$ such that $\mu(s') = \mu(S)$. Because $\mu(s') = \bigvee_{t \in S} x_{s't}$,
there is $t' \in S$ such that $x_{s't'} = \mu(s')$. Hence, we see
that $x_{s't'} = \mu(S) > \eta(S) \ge \eta(t')$. Furthermore, it yields that
$\bigvee_{s \in S} x_{st'} \ge x_{s't'} > \eta(t')$, which contradicts the second
equation in the system. Therefore, the necessity holds.

Conversely, assume that $\mu(S) = \eta(S)$. Then there are
$s',t' \in S$ such that $\mu(s') = \mu(S) = \eta(S) = \eta(t')$. For any
$s,t \in S$, we take

$$x_{st} = \begin{cases}
\mu(s), & \text{if } t = t' \\
\eta(t), & \text{if } s = s' \\
0, & \text{otherwise.}
\end{cases}$$

Note that $x_{s't'} = \mu(s') = \eta(t')$ by the definition above.
We claim that $\{x_{st} : s,t \in S\}$ is a solution to the system
(1). In fact, it is clear that $x_{st} \ge 0$ for all $s,t \in S$. The
first two equations in (1) are analogous, so we merely verify
the first one. For any $s \in S$, if $s = s'$, then we have that
$\bigvee_{t \in S} x_{s't} = \bigvee_{t \in S} \eta(t) = \eta(S) = \mu(S) = \mu(s')$; if $s \neq s'$, we
get that $\bigvee_{t \in S} x_{st} = x_{st'} \vee (\bigvee_{t \in S \setminus \{t'\}} x_{st}) = \mu(s) \vee 0 = \mu(s)$.
Consequently, it holds that $\bigvee_{t \in S} x_{st} = \mu(s)$ for any $s \in S$, as
desired. ■
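The construction used in the sufficiency part can be checked mechanically. The following Python sketch builds the family $x_{st}$ exactly as in the proof and leaves the verification of the two join equations to the caller; the dictionary encoding, the function name `coupling`, and the sample values are assumptions introduced for illustration.

```python
def coupling(mu, eta, states):
    """Build x[(s, t)] as in the proof; requires equal heights of mu and eta."""
    height = max(mu[s] for s in states)
    if height != max(eta[t] for t in states):
        raise ValueError("heights differ, so the system has no solution")
    s_star = next(s for s in states if mu[s] == height)   # plays the role of s'
    t_star = next(t for t in states if eta[t] == height)  # plays the role of t'
    x = {}
    for s in states:
        for t in states:
            if t == t_star:
                x[(s, t)] = mu[s]
            elif s == s_star:
                x[(s, t)] = eta[t]
            else:
                x[(s, t)] = 0.0
    return x

# Hypothetical distributions with common height 0.8.
states = ["u", "v", "w"]
mu = {"u": 0.3, "v": 0.8, "w": 0.0}
eta = {"u": 0.8, "v": 0.5, "w": 0.2}
x = coupling(mu, eta, states)
row_joins = {s: max(x[(s, t)] for t in states) for s in states}
col_joins = {t: max(x[(s, t)] for s in states) for t in states}
```

For this input `row_joins` reproduces `mu` and `col_joins` reproduces `eta`, as the lemma predicts.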

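Proof of Theorem 1: We need to check that $d$ satisfies the
three requirements of pseudo-ultrametric. Let $\mu,\eta,\theta \in \mathcal{F}(S)$.

For (P1), we see that $d(\mu,\mu) = 0$ by setting $x_{ss} = \mu(s)$
and $x_{st} = 0$ for all $s \neq t$.

For (P2), if $\mu(S) \neq \eta(S)$, we have that $d(\mu,\eta) =
1 = d(\eta,\mu)$. Otherwise, without loss of generality, we may
assume that $d(\mu,\eta) < d(\eta,\mu)$, and suppose that $d(\mu,\eta)$
is attained by some $\{x_{st} : s,t \in S\}$, namely, $d(\mu,\eta) =
\bigvee_{s,t \in S}(d(s,t) \wedge x_{st})$. For all $s,t \in S$, set $y_{st} = x_{ts}$. It is
easy to see that $\bigvee_{t \in S} y_{st} = \bigvee_{t \in S} x_{ts} = \eta(s)$ for every $s \in S$
and $\bigvee_{s \in S} y_{st} = \bigvee_{s \in S} x_{ts} = \mu(t)$ for every $t \in S$. Moreover,
we obtain that

$$\begin{aligned}
\bigvee_{s,t \in S} (d(s,t) \wedge y_{st}) &= \bigvee_{s,t \in S} (d(s,t) \wedge x_{ts}) \\
&= \bigvee_{s,t \in S} (d(t,s) \wedge x_{ts}) \\
&= \bigvee_{s,t \in S} (d(s,t) \wedge x_{st}) \\
&= d(\mu,\eta),
\end{aligned}$$

which means that $d(\mu,\eta) \ge d(\eta,\mu)$. It is a contradiction.
Therefore, $d(\mu,\eta) = d(\eta,\mu)$.

For (P3), if $\mu(S) \neq \eta(S)$ or $\eta(S) \neq \theta(S)$, we always have
that $d(\mu,\eta) \vee d(\eta,\theta) = 1 \ge d(\mu,\theta)$. Otherwise, we see that
$\mu(S) = \eta(S) = \theta(S)$. In this case, suppose that $d(\mu,\eta) =
\bigvee_{s,t \in S}(d(s,t) \wedge x_{st})$ and $d(\eta,\theta) = \bigvee_{s,t \in S}(d(s,t) \wedge y_{st})$.
Then we have the following equations:

$$\bigvee_{t \in S} x_{st} = \mu(s), \quad \forall s \in S \qquad (4)$$
$$\bigvee_{s \in S} x_{st} = \eta(t), \quad \forall t \in S \qquad (5)$$
$$\bigvee_{t \in S} y_{st} = \eta(s), \quad \forall s \in S \qquad (6)$$
$$\bigvee_{s \in S} y_{st} = \theta(t), \quad \forall t \in S \qquad (7)$$

For all $s,t \in S$, set $z_{st} = \bigvee_{r \in S}(x_{sr} \wedge y_{rt})$. For any given
$s \in S$, we get that

$$\begin{aligned}
\bigvee_{t \in S} z_{st} &= \bigvee_{t \in S} \bigvee_{r \in S} (x_{sr} \wedge y_{rt}) \\
&= \bigvee_{r \in S} \bigvee_{t \in S} (x_{sr} \wedge y_{rt}) \\
&= \bigvee_{r \in S} [x_{sr} \wedge (\bigvee_{t \in S} y_{rt})] \\
&= \bigvee_{r \in S} [x_{sr} \wedge \eta(r)] && \text{by (6)} \\
&= \bigvee_{r \in S} x_{sr} && \text{by (5)} \\
&= \mu(s), && \text{by (4)}
\end{aligned}$$

namely, $\bigvee_{t \in S} z_{st} = \mu(s)$. At the same time, for any $t \in S$, we
have that

$$\begin{aligned}
\bigvee_{s \in S} z_{st} &= \bigvee_{s \in S} \bigvee_{r \in S} (x_{sr} \wedge y_{rt}) \\
&= \bigvee_{r \in S} \bigvee_{s \in S} (x_{sr} \wedge y_{rt}) \\
&= \bigvee_{r \in S} [(\bigvee_{s \in S} x_{sr}) \wedge y_{rt}] \\
&= \bigvee_{r \in S} [\eta(r) \wedge y_{rt}] && \text{by (5)} \\
&= \bigvee_{r \in S} y_{rt} && \text{by (6)} \\
&= \theta(t), && \text{by (7)}
\end{aligned}$$

i.e., $\bigvee_{s \in S} z_{st} = \theta(t)$. Moreover, we see that $d(\mu,\theta) \le
d(\mu,\eta) \vee d(\eta,\theta)$ by the following chain of inequalities:

$$\begin{aligned}
d(\mu,\theta) &\le \bigvee_{s,t \in S} (d(s,t) \wedge z_{st}) \\
&= \bigvee_{s,t \in S} [d(s,t) \wedge (\bigvee_{r \in S} (x_{sr} \wedge y_{rt}))] \\
&\le \bigvee_{s,t \in S} \bigvee_{r \in S} [(d(s,r) \vee d(r,t)) \wedge x_{sr} \wedge y_{rt}] \\
&= \bigvee_{s,t \in S} \bigvee_{r \in S} [(d(s,r) \wedge x_{sr} \wedge y_{rt}) \vee (d(r,t) \wedge x_{sr} \wedge y_{rt})] \\
&\le \bigvee_{s,t \in S} \bigvee_{r \in S} [(d(s,r) \wedge x_{sr}) \vee (d(r,t) \wedge y_{rt})] \\
&= [\bigvee_{s,t \in S} \bigvee_{r \in S} (d(s,r) \wedge x_{sr})] \vee [\bigvee_{s,t \in S} \bigvee_{r \in S} (d(r,t) \wedge y_{rt})] \\
&= [\bigvee_{s,r \in S} (d(s,r) \wedge x_{sr})] \vee [\bigvee_{r,t \in S} (d(r,t) \wedge y_{rt})] \\
&= d(\mu,\eta) \vee d(\eta,\theta).
\end{aligned}$$

Therefore, $d$ is a pseudo-ultrametric on $\mathcal{F}(S)$, completing the
proof of the theorem. ■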
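The key step for (P3) is that the max-min composition $z_{st} = \bigvee_{r}(x_{sr} \wedge y_{rt})$ has the right joins. A small Python sketch of this step follows, assuming states are indexed 0..n-1 and the families are stored as nested lists; the name `compose` and the numeric data are hypothetical.

```python
def compose(x, y):
    """Max-min composition: z[s][t] = max over r of min(x[s][r], y[r][t])."""
    n = len(x)
    return [[max(min(x[s][r], y[r][t]) for r in range(n)) for t in range(n)]
            for s in range(n)]

# Hypothetical distributions with common height 1.0 on two states.
mu, eta, theta = [0.5, 1.0], [1.0, 0.5], [0.2, 1.0]
x = [[0.5, 0.5], [1.0, 0.5]]   # row joins give mu, column joins give eta
y = [[0.2, 1.0], [0.2, 0.5]]   # row joins give eta, column joins give theta
z = compose(x, y)
z_rows = [max(row) for row in z]
z_cols = [max(z[s][t] for s in range(2)) for t in range(2)]
```

For this data `z_rows` equals `mu` and `z_cols` equals `theta`, so `z` is feasible for the problem defining $d(\mu,\theta)$.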
Proof of Lemma 2: For any $X \subseteq \mathcal{D}$, we define $\sqcap X$ by

$$(\sqcap X)(s,t) = \bigvee \{d(s,t) : d \in X\}$$

and $\sqcup X$ by

$$\sqcup X = \sqcap \{d \in \mathcal{D} : \forall d' \in X,\ d' \preceq d\}.$$

In particular, $(\sqcap \emptyset)(s,t) = 0$ for all $s,t \in S$, and $(\sqcup \emptyset)(s,t) = 0$
if $s = t$, 1 otherwise. We need to verify that $\sqcap X$ and $\sqcup X$
are the infimum and supremum of $X$, respectively. We only
prove that $\sqcap X$ is the infimum, since the supremum can be
proved similarly.

We first show that $\sqcap X \in \mathcal{D}$. In fact, it is obvious that
$(\sqcap X)(s,s) = 0$ and $(\sqcap X)(s,t) = (\sqcap X)(t,s)$ for all $s,t \in S$.
For (P3), we have that

$$\begin{aligned}
(\sqcap X)(s,t) &= \bigvee_{d \in X} d(s,t) \\
&\le \bigvee_{d \in X} (d(s,r) \vee d(r,t)) \\
&= (\bigvee_{d \in X} d(s,r)) \vee (\bigvee_{d \in X} d(r,t)) \\
&= [(\sqcap X)(s,r)] \vee [(\sqcap X)(r,t)],
\end{aligned}$$

for any $s,t,r \in S$. Hence, $\sqcap X$ is a pseudo-ultrametric.

Observe that $(\sqcap X)(s,t) \ge d(s,t)$, that is, $\sqcap X \preceq d$, for
all $d \in X$. Therefore, $\sqcap X$ is a lower bound of $X$. On the
other hand, for any $h \in \mathcal{D}$, if $h \preceq d$, i.e., $h(s,t) \ge d(s,t)$,
for all $d \in X$, then it is easy to see that $h \preceq \sqcap X$, which
means that $\sqcap X$ is not less than any other lower bound of
$X$. Consequently, $\sqcap X$ is the infimum of $X$. This proves that
$(\mathcal{D}, \preceq)$ is a complete lattice. ■
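Because the order $\preceq$ is the reverse of the pointwise order, the infimum $\sqcap X$ is a pointwise maximum. The Python sketch below checks on one small instance that this maximum is again a pseudo-ultrametric. The helper names `make`, `is_pseudo_ultrametric`, `lattice_inf` and the two sample distances are hypothetical.

```python
def make(pairs, pts):
    """Build a symmetric distance table with zero diagonal from unordered pairs."""
    d = {(s, s): 0.0 for s in pts}
    for (s, t), v in pairs.items():
        d[(s, t)] = d[(t, s)] = v
    return d

def is_pseudo_ultrametric(d, pts):
    """Check zero diagonal, symmetry, and the strong triangle inequality."""
    return (all(d[(s, s)] == 0 for s in pts)
            and all(d[(s, t)] == d[(t, s)] for s in pts for t in pts)
            and all(d[(s, t)] <= max(d[(s, r)], d[(r, t)])
                    for s in pts for t in pts for r in pts))

def lattice_inf(ds, pts):
    """Infimum w.r.t. the reversed order, i.e. the pointwise maximum."""
    return {(s, t): max(d[(s, t)] for d in ds) for s in pts for t in pts}

pts = ["a", "b", "c"]
d1 = make({("a", "b"): 0.2, ("a", "c"): 0.5, ("b", "c"): 0.5}, pts)
d2 = make({("a", "b"): 0.7, ("a", "c"): 0.7, ("b", "c"): 0.1}, pts)
inf = lattice_inf([d1, d2], pts)
```

The result `inf` dominates both inputs pointwise, which is what being a lower bound means under $\preceq$.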

Proof of Lemma: We first prove the following claim: For
any given $d_1,d_2 \in \mathcal{D}$, if $d_1 \preceq d_2$, then it holds that $d_1(\mu,\eta) \ge
d_2(\mu,\eta)$ for all $\mu,\eta \in \mathcal{F}(S)$.

In fact, we have that $d_1(s,t) \ge d_2(s,t)$ for all $s,t \in S$, as
$d_1 \preceq d_2$. Suppose that $d_1(\mu,\eta)$ is achieved by $\{x_{st} : s,t \in S\}$.
Clearly, $\{x_{st} : s,t \in S\}$ is also a feasible solution to (MP)
defining $d_2(\mu,\eta)$. As a result, we obtain that

$$\begin{aligned}
d_1(\mu,\eta) &= \bigvee_{s,t \in S} (d_1(s,t) \wedge x_{st}) \\
&\ge \bigvee_{s,t \in S} (d_2(s,t) \wedge x_{st}) \\
&\ge d_2(\mu,\eta),
\end{aligned}$$

as desired.

The proof of the lemma is straightforward by the previous
claim and thus omitted here. ■

Proof of Corollary: According to Tarski's fixed point theorem,
the greatest fixed point $\Delta_{\max}$ can be obtained by iteration
of $\Delta$ starting from the greatest element $\top$ (see, for example,
[44]). Hence, to show that $\Delta_{\max} = \sqcap \{\Delta^n(\top) : n \in \mathbb{N}\}$, we
only need to prove that the closure ordinal of $\Delta$, i.e., the least
ordinal $n$ such that $\Delta^{n+1} = \Delta^n$, is at most $\omega$. In fact, for any
$(s,t) \in S \times S$, if $s \xrightarrow{a} \mu$, then for each $d_n = \Delta^n(\top)$, there
exists $\eta_n$ such that $t \xrightarrow{a} \eta_n$ and $d_n(\mu,\eta_n) \le d_n(s,t)$. Thanks
to the image-finiteness of the FTS, there is an $\eta_n$, say $\eta$, such
that $t \xrightarrow{a} \eta$ and $d_n(\mu,\eta) \le d_n(s,t)$ for all but finitely many
$n$, as desired. ■
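For a small finite FTS, the iteration of $\Delta$ from $\top$ (the distance that is 0 everywhere) can be tried out directly. The Python sketch below is a naive illustration, not an algorithm given in the paper: the threshold search in `dist_distance` is one assumed way to solve (MP) for finite inputs, the conventions $H_d(\emptyset,\emptyset) = 0$ and $H_d(X,\emptyset) = 1$ for nonempty $X$ are assumptions, and all names and the sample FTS are hypothetical.

```python
def dist_distance(d, mu, eta, states):
    """d(mu, eta): 1 if the heights differ, else the optimum of (MP)."""
    m = lambda f, s: f.get(s, 0.0)
    if max(m(mu, s) for s in states) != max(m(eta, s) for s in states):
        return 1.0
    cands = ({0.0} | set(d.values())
             | {m(mu, s) for s in states} | {m(eta, s) for s in states})
    for theta in sorted(cands):
        # cap(s, t): largest x_st that keeps the objective at most theta
        def cap(s, t):
            u = min(m(mu, s), m(eta, t))
            return u if d[(s, t)] <= theta else min(u, theta)
        rows = all(max(cap(s, t) for t in states) == m(mu, s) for s in states)
        cols = all(max(cap(s, t) for s in states) == m(eta, t) for t in states)
        if rows and cols:
            return theta
    return 1.0

def hausdorff(d, X, Y, states):
    """Hausdorff lifting of dist_distance to finite sets of distributions."""
    if not X and not Y:
        return 0.0
    if not X or not Y:
        return 1.0
    fwd = max(min(dist_distance(d, mu, eta, states) for eta in Y) for mu in X)
    bwd = max(min(dist_distance(d, mu, eta, states) for mu in X) for eta in Y)
    return max(fwd, bwd)

def behavioral_distance(states, actions, delta):
    """Iterate Delta from the all-zero distance until nothing changes."""
    d = {(s, t): 0.0 for s in states for t in states}
    while True:
        new = {(s, t): max(hausdorff(d, delta.get((s, a), []),
                                     delta.get((t, a), []), states)
                           for a in actions)
               for s in states for t in states}
        if new == d:
            return d
        d = new

# Hypothetical FTS, not the one from Fig. 1.
states, actions = ["p", "q", "x", "z"], ["a", "b"]
delta = {
    ("p", "a"): [{"x": 0.7, "z": 1.0}],
    ("q", "a"): [{"z": 1.0}],
    ("x", "b"): [{"z": 1.0}],
}
df = behavioral_distance(states, actions, delta)
```

On this input the loop stabilizes after a few rounds, because every value it produces is drawn from the finitely many degrees that occur in the system together with 0 and 1.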
Proof of Lemma 5: The equivalence of (1) and (2) follows
immediately from Definition 8. Therefore, we only prove the
equivalence of (2) and (3).

(2) $\Rightarrow$ (3). By the condition of (2), we define the weight
function $w$ by setting

$$w(s,t) = \bigvee \{c_i : (s_i,t_i) = (s,t),\ i \in I\}$$

for each $(s,t) \in S \times S$. It remains to check that such a weight
function satisfies the requirements (a), (b), and (c) of (3).

(a) For any $s \in S$, we have that

$$\begin{aligned}
\bigvee_{t \in S} w(s,t) &= \bigvee_{t \in S} \bigvee \{c_i : (s_i,t_i) = (s,t),\ i \in I\} \\
&= \bigvee \{c_i : s_i = s,\ i \in I\} \\
&= \bigvee_{i \in I} [c_i \wedge \hat{s}_i(s)] \\
&= \mu(s).
\end{aligned}$$

(b) The proof is similar to that of (a).

(c) If $w(s,t) > 0$, then by the definition of $w$ there exists
some $i \in I$ such that $c_i > 0$ and $(s_i,t_i) = (s,t)$. We see
that $(s,t) \in R$, since $(s_i,t_i) \in R$.

(3) $\Rightarrow$ (2). Suppose that there is a weight function $w$
satisfying the conditions (a), (b), and (c) in (3). Then we
take $I = \{(s,t) : w(s,t) > 0,\ (s,t) \in S \times S\}$ and set
$c_{(s,t)} = w(s,t)$ for each $(s,t) \in I$. Clearly, for any $(s,t) \in I$,
we have that $w(s,t) > 0$, which implies $(s,t) \in R$ by the
condition (c) in (3). Moreover, for any $s' \in S$, we obtain that

$$\begin{aligned}
\Big(\bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{s}\Big)(s') &= \bigvee_{(s,t) \in I} (w(s,t) \wedge \hat{s}(s')) \\
&= \bigvee \{w(s',t) : w(s',t) > 0,\ t \in S\} \\
&= \bigvee \{w(s',t) : t \in S\} \\
&= \mu(s'),
\end{aligned}$$

that is, $\mu = \bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{s}$. By the same token,
$\eta = \bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{t}$. Hence, (2) holds. This
completes the proof of the lemma. ■
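Proof of Lemma 6: We first show the necessity. Suppose
that $(\mu,\eta) \in R$. Then by Lemma 5 there are $(s_i,t_i) \in R$ and
$c_i \in [0,1]$, $i \in I$, such that $\mu = \bigcup_{i \in I} c_i \cdot \hat{s}_i$ and $\eta = \bigcup_{i \in I} c_i \cdot \hat{t}_i$.
For any $C \in S/R$, we have that

$$\begin{aligned}
\mu(C) &= \bigvee_{s \in C} \mu(s) \\
&= \bigvee_{s \in C} \bigvee \{c_i : i \in I,\ s_i = s\} \\
&= \bigvee \{c_i : i \in I,\ s_i \in C\}.
\end{aligned}$$

In a similar vein, we can get that $\eta(C) = \bigvee \{c_i : i \in I,\ t_i \in
C\}$. As $(s_i,t_i) \in R$, we see that $s_i \in C$ if and only if $t_i \in C$.
Consequently, it yields that $\mu(C) = \eta(C)$.

Next, to see the sufficiency, assume that $\mu(C) = \eta(C)$ for
every $C \in S/R$. Take $I = \{(s,t) : (s,t) \in R\}$ and let $c_{(s,t)} =
\mu(s) \wedge \eta(t)$ for each $(s,t) \in I$. For any $(s,t) \in I$, it is obvious
that $(s,t) \in R$. Furthermore, for any $s' \in S$, we have that

$$\begin{aligned}
\Big(\bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{s}\Big)(s') &= \bigvee_{(s,t) \in I} (c_{(s,t)} \wedge \hat{s}(s')) \\
&= \bigvee \{\mu(s') \wedge \eta(t) : (s',t) \in R,\ t \in S\} \\
&= \bigvee \{\mu(s') \wedge \eta(t) : t \in [s']\} \\
&= \mu(s') \wedge (\bigvee_{t \in [s']} \eta(t)) \\
&= \mu(s') \wedge \eta([s']) \\
&= \mu(s') \wedge \mu([s']) \\
&= \mu(s'),
\end{aligned}$$

where we write $[s']$ for the equivalence class containing $s'$.
It gives that $\mu = \bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{s}$. Similarly, we can get that
$\eta = \bigcup_{(s,t) \in I} c_{(s,t)} \cdot \hat{t}$. Whence, we have that $(\mu,\eta) \in R$ by
Lemma 5, finishing the proof. ■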
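The criterion of this lemma is easy to test on finite data: two distributions are related exactly when they give the same possibility to every equivalence class. A minimal Python sketch follows, assuming classes are given as sets of states and $\mu(C)$ is the maximum degree over $C$; the name `related` and the sample data are hypothetical.

```python
def related(mu, eta, classes):
    """Check mu(C) == eta(C) for every class C, where f(C) is the max over C."""
    height = lambda f, C: max((f.get(s, 0.0) for s in C), default=0.0)
    return all(height(mu, C) == height(eta, C) for C in classes)

# Hypothetical partition S/R and distributions.
classes = [{"s1", "s2"}, {"s3"}]
mu = {"s1": 0.8, "s3": 0.4}
eta_ok = {"s1": 0.3, "s2": 0.8, "s3": 0.4}
eta_bad = {"s2": 0.8, "s3": 0.5}
```

Here `mu` and `eta_ok` agree on both classes, while `eta_bad` differs on the class `{"s3"}`.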
Proof of Lemma 7: First we prove the sufficiency by constructing
a weight function $w : S \times S \to [0,1]$ that satisfies
the requirements (a), (b), and (c) of Lemma 5. Suppose that
$d(\mu,\eta) = 0$ and it is attained by $\{x_{st} : s,t \in S\}$. It means that
$\bigvee_{s,t \in S}(d(s,t) \wedge x_{st}) = 0$, $\bigvee_{t \in S} x_{st} = \mu(s)$ for any $s \in S$, and
$\bigvee_{s \in S} x_{st} = \eta(t)$ for any $t \in S$. Simply taking $w(s,t) = x_{st}$,
we see that the requirements (a) and (b) of Lemma 5 are
satisfied. If $w(s,t) = x_{st} > 0$, then $\bigvee_{s,t \in S}(d(s,t) \wedge x_{st}) = 0$
forces $d(s,t) = 0$, which implies that
$(s,t) \in R$ by the assumed correspondence between $d$ and $R$. It thus follows by Lemma 5 that $(\mu,\eta) \in R$.

Now, let us show the necessity. Suppose that $(\mu,\eta) \in R$.
Then by Lemma 5 there is a weight function $w : S \times S \to
[0,1]$ such that $\bigvee_{t \in S} w(s,t) = \mu(s)$ for any $s \in S$ and
$\bigvee_{s \in S} w(s,t) = \eta(t)$ for any $t \in S$. Moreover, $w(s,t) > 0$
implies $(s,t) \in R$, which means that $w(s,t) > 0$
implies $d(s,t) = 0$. Taking $x_{st} = w(s,t)$ for all $s,t \in S$,
we see that $\{x_{st} : s,t \in S\}$ is a feasible solution to (MP). It
follows that $d(\mu,\eta) = 0$ since $\bigvee_{s,t \in S}(d(s,t) \wedge x_{st}) = 0$. This
finishes the proof. ■

Proof of Theorem 3: We first show the sufficiency. Assume
that $d_R$ is a post-fixed point of $\Delta$. For any $(s,t) \in R$ and
$a \in A$, if $s \xrightarrow{a} \mu$, then by Lemma 9 there exists some $\eta$
such that $t \xrightarrow{a} \eta$ and $d_R(\mu,\eta) \le d_R(s,t)$. In order to show
that $R$ is a bisimulation, it remains to prove that $\mu(C) =
\eta(C)$ for any equivalence class $C \in S/R$. In fact, we see that
$d_R(\mu,\eta) = 0$ since $d_R(s,t) = 0$ by the definition of $d_R$. Note
that by Lemma 8 $d_R$ is a pseudo-ultrametric on $S$. It thus
follows from Lemma 7 that $(\mu,\eta) \in R$. As a result, we obtain
by Lemma 6 that $\mu(C) = \eta(C)$ for any $C \in S/R$, as desired.

For the necessity, suppose that $R$ is a bisimulation. We now
prove $d_R \preceq \Delta(d_R)$ by Lemma 9. Let $(s,t) \in S \times S$ and
$a \in A$. If $(s,t) \notin R$, then $d_R(s,t) = 1$ by the definition
of $d_R$ and the condition of Lemma 9 is naturally satisfied.
On the other hand, if $(s,t) \in R$ and $s \xrightarrow{a} \mu$, then by the
definition of bisimulation there exists some $\eta$ such that $t \xrightarrow{a} \eta$
and $\mu(C) = \eta(C)$ for all $C \in S/R$. The latter implies that
$(\mu,\eta) \in R$ by Lemma 6. Whence, we see by Lemma 7 that
$d_R(\mu,\eta) = 0$, which means that the condition of Lemma 9
holds. Therefore, $d_R$ is a post-fixed point of $\Delta$, finishing the
proof of the theorem. ■

Proof of Theorem 4: For the 'only if' part, suppose that
$s \sim t$. Then there is a bisimulation $R$ that contains $(s,t)$.
Thus, we get by Theorem 3 that $d_R$ is a post-fixed point of
$\Delta$ with $d_R(s,t) = 0$. By the definition of $d_f$, we have that
$d_R \preceq d_f$. Consequently, $d_f(s,t) \le d_R(s,t) = 0$, which forces
that $d_f(s,t) = 0$.

For the converse, consider the relation $R$ defined by $R =
\{(s,t) \in S \times S : d_f(s,t) = 0\}$, namely, $(s,t) \in R$ if and
only if $d_f(s,t) = 0$. Clearly, $R$ is an equivalence relation. It
remains to show that this equivalence relation is a bisimulation.
For any $(s,t) \in R$ and $a \in A$, if $s \xrightarrow{a} \mu$, we obtain by
Lemma 9 that there exists some $\eta$ such that $t \xrightarrow{a} \eta$ and
$d_f(\mu,\eta) \le d_f(s,t)$, because the greatest fixed point $d_f$ is also
a post-fixed point of $\Delta$. As $(s,t) \in R$, we see that $d_f(s,t) = 0$
and thus $d_f(\mu,\eta) = 0$, which implies that $(\mu,\eta) \in R$ by
Lemma 7. It follows from Lemma 6 that $\mu(C) = \eta(C)$ for all
$C \in S/R$. This proves that $R$ is a bisimulation, completing the
proof. ■

Proof of Corollary 4: As $s \sim s'$ and $t \sim t'$, we see by
Theorem 4 that $d_f(s,s') = d_f(t,t') = 0$. Note that $d_f$ is a
pseudo-ultrametric. We thus get by (P3) that

$$\begin{aligned}
d_f(s,t) &\le d_f(s,s') \vee d_f(s',t) \\
&= d_f(s',t) \\
&\le d_f(s',t') \vee d_f(t',t) \\
&= d_f(s',t') \vee d_f(t,t') \\
&= d_f(s',t'),
\end{aligned}$$

namely, $d_f(s,t) \le d_f(s',t')$. The converse can be proved in
the same way. Thereby, $d_f(s,t) = d_f(s',t')$, as desired. ■

Proof of Theorem 5: We only prove the first assertion; the
second one can be proved similarly.

For (1), consider the parallel composition $(S \times S, A, \delta')$. Let
us first define a function $D : (S \times S) \times (S \times S) \to [0,1]$ by
setting

$$D(s|s', t|t') = d_f(s,t) \vee d_f(s',t')$$

for any $s|s', t|t' \in S \times S$. It is easy to check that $D$ is a
pseudo-ultrametric on $S \times S$. We claim that $D$ is a post-fixed
point of $\Delta'$.

Let us verify the claim by using Lemma 9. For any given
$(s|s', t|t') \in (S \times S) \times (S \times S)$ and $a \in A$, suppose that
$s|s' \xrightarrow{a} \mu \wedge \mu'$. We need to show that there exists some $\eta \wedge \eta'$
such that $t|t' \xrightarrow{a} \eta \wedge \eta'$ and $D(\mu \wedge \mu', \eta \wedge \eta') \le D(s|s', t|t')$.
We have to discuss four cases: $a \in A_s \setminus A_{s'}$, $a \in A_{s'} \setminus A_s$,
$a \in A_s \cap A_{s'}$, and $a \notin A_s \cup A_{s'}$. We only go into the details
of the third case; the other cases are simpler and can be proved
in a similar way.

In the case of $a \in A_s \cap A_{s'}$, we get by the definition of
parallel composition that there are $s \xrightarrow{a} \mu$ and $s' \xrightarrow{a} \mu'$.
Because the greatest fixed point $d_f$ of $\Delta$ is a post-fixed point
as well, by Lemma 9 there are $\eta$ and $\eta'$ such that $t \xrightarrow{a} \eta$
and $t' \xrightarrow{a} \eta'$. Moreover, we get that $d_f(\mu,\eta) \le d_f(s,t)$
and $d_f(\mu',\eta') \le d_f(s',t')$. It yields that $t|t' \xrightarrow{a} \eta \wedge \eta'$ and
remains to verify that $D(\mu \wedge \mu', \eta \wedge \eta') \le D(s|s', t|t')$. Two
subcases need to be considered: One is that both $\mu(S) = \eta(S)$
and $\mu'(S) = \eta'(S)$; the other is that either $\mu(S) \neq \eta(S)$ or
$\mu'(S) \neq \eta'(S)$.

In the first subcase, we may assume that $d_f(\mu,\eta)$ and
$d_f(\mu',\eta')$ are attained by $\{x_{st} : s,t \in S\}$ and $\{y_{s't'} : s',t' \in
S\}$, respectively, that is, $d_f(\mu,\eta) = \bigvee_{s,t \in S}(d_f(s,t) \wedge x_{st})$
and $d_f(\mu',\eta') = \bigvee_{s',t' \in S}(d_f(s',t') \wedge y_{s't'})$. Then we have the
following:

$$\bigvee_{t \in S} x_{st} = \mu(s), \quad \forall s \in S \qquad (8)$$
$$\bigvee_{s \in S} x_{st} = \eta(t), \quad \forall t \in S \qquad (9)$$
$$\bigvee_{t' \in S} y_{s't'} = \mu'(s'), \quad \forall s' \in S \qquad (10)$$
$$\bigvee_{s' \in S} y_{s't'} = \eta'(t'), \quad \forall t' \in S \qquad (11)$$

For all $s|s', t|t' \in S \times S$, set $z_{s|s',t|t'} = x_{st} \wedge y_{s't'}$. For any
$s|s' \in S \times S$, we find that

$$\begin{aligned}
\bigvee_{t|t' \in S \times S} z_{s|s',t|t'} &= \bigvee_{t,t' \in S} (x_{st} \wedge y_{s't'}) \\
&= \bigvee_{t \in S} [x_{st} \wedge (\bigvee_{t' \in S} y_{s't'})] \\
&= \bigvee_{t \in S} [x_{st} \wedge \mu'(s')] && \text{by (10)} \\
&= (\bigvee_{t \in S} x_{st}) \wedge \mu'(s') \\
&= \mu(s) \wedge \mu'(s') && \text{by (8)} \\
&= (\mu \wedge \mu')(s|s'),
\end{aligned}$$

namely, $\bigvee_{t|t' \in S \times S} z_{s|s',t|t'} = (\mu \wedge \mu')(s|s')$. On the other hand,
for any $t|t' \in S \times S$, we have that

$$\begin{aligned}
\bigvee_{s|s' \in S \times S} z_{s|s',t|t'} &= \bigvee_{s,s' \in S} (x_{st} \wedge y_{s't'}) \\
&= \bigvee_{s \in S} [x_{st} \wedge (\bigvee_{s' \in S} y_{s't'})] \\
&= \bigvee_{s \in S} [x_{st} \wedge \eta'(t')] && \text{by (11)} \\
&= (\bigvee_{s \in S} x_{st}) \wedge \eta'(t') \\
&= \eta(t) \wedge \eta'(t') && \text{by (9)} \\
&= (\eta \wedge \eta')(t|t'),
\end{aligned}$$

i.e., $\bigvee_{s|s' \in S \times S} z_{s|s',t|t'} = (\eta \wedge \eta')(t|t')$. As a result, $\{z_{s|s',t|t'} :
s|s', t|t' \in S \times S\}$ is a feasible solution to the mathematical
programming problem (MP) defining $D(\mu \wedge \mu', \eta \wedge \eta')$.
Therefore, we get that $D(\mu \wedge \mu', \eta \wedge \eta') \le D(s|s', t|t')$ by the
following chain of inequalities:

$$\begin{aligned}
D(\mu \wedge \mu', \eta \wedge \eta') &\le \bigvee_{s|s',t|t' \in S \times S} [D(s|s',t|t') \wedge z_{s|s',t|t'}] \\
&= \bigvee_{s,s',t,t' \in S} [(d_f(s,t) \vee d_f(s',t')) \wedge (x_{st} \wedge y_{s't'})] \\
&= \bigvee_{s,s',t,t' \in S} [(d_f(s,t) \wedge x_{st} \wedge y_{s't'}) \vee (d_f(s',t') \wedge x_{st} \wedge y_{s't'})] \\
&\le \bigvee_{s,s',t,t' \in S} [(d_f(s,t) \wedge x_{st}) \vee (d_f(s',t') \wedge y_{s't'})] \\
&= [\bigvee_{s,t \in S} (d_f(s,t) \wedge x_{st})] \vee [\bigvee_{s',t' \in S} (d_f(s',t') \wedge y_{s't'})] \\
&= d_f(\mu,\eta) \vee d_f(\mu',\eta') \\
&\le d_f(s,t) \vee d_f(s',t') \\
&= D(s|s',t|t').
\end{aligned}$$

In the second subcase, we see that $d_f(\mu,\eta) = 1$ or
$d_f(\mu',\eta') = 1$. It follows that $d_f(s,t) = 1$ or $d_f(s',t') = 1$,
since $d_f(\mu,\eta) \le d_f(s,t)$ and $d_f(\mu',\eta') \le d_f(s',t')$. Therefore,
we get that $D(s|s',t|t') = d_f(s,t) \vee d_f(s',t') = 1 \ge
D(\mu \wedge \mu', \eta \wedge \eta')$. This completes the proof of the claim.

Based on the claim, it follows by the definition of $d'_f$ that
$D \preceq d'_f$, which means that

$$\begin{aligned}
d'_f(s_1|s_2, t_1|t_2) &\le D(s_1|s_2, t_1|t_2) \\
&= d_f(s_1,t_1) \vee d_f(s_2,t_2) \\
&= \epsilon_1 \vee \epsilon_2.
\end{aligned}$$

Consequently, the first assertion holds, finishing the proof. ■